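Hands-On Review | Midjourney V6 Is Live: You Can Now Prompt in Plain Language

Published: 2024-03-25 16:45:09  Views: 175

TL;DR: Midjourney released V6 on December 21, 2023. It is the third model the Midjourney team has trained from scratch. There is currently no benchmark comparing V5.2 and V6, so most of the improvements described here were judged by eye.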



1

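What's New in Midjourney V6

According to Midjourney's announcement, the key updates are:

“Prompts are understood more accurately, longer and more detailed natural-language descriptions are supported, and the details of generated images follow real-world logic more closely.”

“New subtle and creative upscale modes double the resolution of output images.”

“Text rendering has improved greatly, and simple text can now be drawn accurately.”

In one sentence: the latest Midjourney wants you to talk to it in plain language, and in return it gives you images closer to what you imagined.

Midjourney's official announcement

In the announcement, Midjourney stresses that prompting in the new version works very differently from before, and that meaningless "junk words" such as "photorealistic, 4K, 8K" should be avoided. It is best to relearn how to "cast spells": users who have long been in the habit of writing tag-style prompts can now dust off their English writing skills.

That does not mean tag-style keywords are ruled out entirely. With a high Stylize setting, tag-style keywords are still suited to generating "strongly stylized artistic images".

For example, asking it to "depict" the Ping An Finance Centre in Shenzhen still produces a rather showy image.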



2

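Trying Out Midjourney V6

Hands-on: switching to Midjourney V6

The /settings command now supports switching to the V6 [ALPHA] model.

Let's take the new version of Midjourney for a run, starting with a quick look at how well V6 handles detail.

The cat-feeding photo our finance colleague sent me, and my AI cat feeding today

I asked our company's finance colleague for an everyday photo of her "son", Wang Xiaopang, and then generated AI images based on that photo.

Upload the photo of Wang Xiaopang to any image host; the most convenient way is to send it in a Discord chat. That quickly gives you a hosted image link that Midjourney can use to understand what you have in mind (the image prompt is now ready).

In V6, you then simply describe the picture in your head on top of the image prompt.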
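A minimal sketch of how an image prompt and a plain-language description might be combined into one prompt string, assuming the image link comes first and the description follows after a space. The function name `with_image_prompt`, the URL, and the description text are hypothetical and only illustrate the layout; they are not the prompt actually used in this test.

```python
def with_image_prompt(image_url: str, description: str) -> str:
    """Place the reference image URL in front of the text description."""
    image_url = image_url.strip()
    description = description.strip()
    # Assumption: the image must be reachable through a direct http(s) link.
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image prompt must be a direct http(s) URL")
    if not description:
        raise ValueError("description must not be empty")
    return f"{image_url} {description}"


# Hypothetical URL and description, for illustration only.
prompt = with_image_prompt(
    "https://example.com/wang-xiaopang.png",
    "a chubby cat eating from a bowl on a kitchen floor, photographed with a phone",
)
print(prompt)
```

The resulting string is what would be pasted after the /imagine command in Discord.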

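The resulting images no longer look "obviously AI" at first glance.

With clear wording and a detailed description of the image, the gap between the AI image and a straight-from-camera photo has all but disappeared.

For the first time, text can be rendered clearly and completely in AI images

Testing Midjourney V6's text generation on its own

From a designer's point of view, I casually generated two sets of AItechtalk logos. Whether a client would sign off on them is impossible to say, but the two sets took only 90 seconds in total to generate.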

  • prompt:a logo for "AI Tech Talk" --v 6.0 --s 100 Variations (Strong)

AItechtalk logo

AItechtalk logo

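According to Midjourney's current official guidance, text rendering works best at low stylize values. In my tests so far, --s 100 gave the most accurate text.

At higher stylize values, text accuracy drops noticeably and spelling errors appear.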

  • prompt:A monitor displaying the text "AItechtalk" in clear, bold font. --v 6.0 --s 1000
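The low-versus-high stylize comparison above can be sketched as a small prompt builder. This is only an illustration: the helper name `text_prompt` is hypothetical, and the 0 to 1000 range check is an assumption about the `--s` parameter rather than something stated in this article.

```python
def text_prompt(description: str, stylize: int = 100, version: str = "6.0") -> str:
    """Append the version and stylize parameters to a text-rendering prompt."""
    # Assumption: --s (stylize) accepts integers from 0 to 1000.
    if not 0 <= stylize <= 1000:
        raise ValueError("stylize must be between 0 and 1000")
    return f"{description.strip()} --v {version} --s {stylize}"


# The text to be drawn is wrapped in double quotes inside the description.
description = 'A monitor displaying the text "AItechtalk" in clear, bold font.'

print(text_prompt(description))                # low stylize, most accurate text in these tests
print(text_prompt(description, stylize=1000))  # high stylize, misspellings appeared
```

Generating both variants from the same description makes it easier to compare how the stylize value alone affects spelling.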

Text rendered at low stylize also stays accurate in more complex compositions.
Draw a concept bar for AItechtalk.
  • prompt:The facade of "AItechtalk" is a captivating blend of old-world charm and futuristic allure, set in a cozy corner of the street. The exterior is striking, with a sleek, modern sign bearing the bar's name in luminescent letters that glow against the night sky. The entrance is framed by a pair of high-tech, frosted glass doors, etched with subtle, digital patterns that hint at the technological theme inside. As patrons approach, they notice smart panels displaying scrolling text of the latest tech news and snippets of code, engaging the tech-savvy crowd. The windows are tinted but occasionally flicker with the silhouettes of people and the ambient, changing lights from within, suggesting a lively and dynamic environment inside. The overall look is minimalist yet intriguing, inviting those with a curiosity for technology and a love for the night to step into a world where innovation meets relaxation.
Draw a concept café for AItechtalk.
  • prompt:Picture a stylish and sophisticated café, its large, bold sign reading "AI Tech Talk" making a statement in sleek, modern lettering against the backdrop of an elegantly designed façade. The café exudes a sense of upscale yet welcoming ambiance, attracting a clientele of tech aficionados and casual coffee lovers alike. The exterior is tastefully decorated, perhaps with a combination of natural wood and industrial materials, reflecting a blend of warmth and innovation. Large windows offer a transparent view into the interior, where the lighting is cozy and inviting, casting a soft glow on the chic furniture and decor. The "AI Tech Talk" sign is not just a name but a declaration of the café's theme, possibly featuring an element of technology like a digital display or interactive component that hints at the artificial intelligence focus within. Inside, the café might feature artwork or installations related to technology and AI, creating a stimulating environment for conversation and contemplation. The seating is comfortable and arranged to encourage both intimate gatherings and larger group discussions, with perhaps a special corner or stage area for tech talks, workshops, or presentations. The overall atmosphere is one of refined taste and intellectual curiosity, making "AI Tech Talk" a destination for those who appreciate the finer things in life and have a keen interest in the future of technology.

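AItechtalk concept café

Across these tests, Midjourney V6 did well at drawing English text, rendering text details in scenes from simple to complex. It does not handle text in other languages, Chinese characters included, nearly as well: you type in a "spell" and get back an unreadable "talisman".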
  • prompt:Picture a stylish and sophisticated café, its large, bold sign reading "AI科技評論" making a statement in sleek, modern lettering against the backdrop of an elegantly designed façade. The café exudes a sense of upscale yet welcoming ambiance, attracting a clientele of tech aficionados and casual coffee lovers alike. The exterior is tastefully decorated, perhaps with a combination of natural wood and industrial materials, reflecting a blend of warmth and innovation. Large windows offer a transparent view into the interior, where the lighting is cozy and inviting, casting a soft glow on the chic furniture and decor. The "AI科技評論" sign is not just a name but a declaration of the café's theme, possibly featuring an element of technology like a digital display or interactive component that hints at the artificial intelligence focus within. Inside, the café might feature artwork or installations related to technology and AI, creating a stimulating environment for conversation and contemplation. The seating is comfortable and arranged to encourage both intimate gatherings and larger group discussions, with perhaps a special corner or stage area for tech talks, workshops, or presentations. The overall atmosphere is one of refined taste and intellectual curiosity, making "AI Tech Talk" a destination for those who appreciate the finer things in life and have a keen interest in the future of technology.
Moreover, even an AI does not treat all text equally: if you simply type in the name of a world-famous brand, V6 turns in a nearly flawless result.
  • prompt:cocacola
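Note that I said "nearly flawless". If you want to spot the problem, pick up the Starbucks cup next to you and compare it with the image below to find the flaw.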
  • prompt:Starbucks
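Then there is KFC's Colonel, smiling without showing his teeth:
  • prompt:KFC
It is hard to say where, in the AI's eyes, the line between text and image lies; the smarter the model, the blurrier the line.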

Midjourney v5.2 VS Midjourney v6

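V6 is the third model Midjourney has trained from scratch. Compared with V5.2, its composition, color and lighting detail, and rendering of physical materials are all better.
1. prompt:Yangzhou Fried Rice

Yangzhou fried rice

2. prompt:lady Photo booth

Women's photo-booth portrait

3. prompt:chinese lady Photo booth

Chinese women's photo-booth portrait

4. prompt:Girl at the window.

Woman at the window

5. prompt:Girl with hair blowing in the wind.

Woman with long, flowing hair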

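Tell a Story and Let the AI Understand You

Midjourney V6 can currently take a short passage of 350 words or more as a detailed description, and the result comes back closer to a real photograph. You can describe, for example, every item of clothing the subject wears, every gesture and movement, the assumed logic of how the photo was taken, and every structural detail of the composition.
In short, if you are good at telling stories, you can generate images by telling stories, and then use those images to show your story to more people.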
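When writing story-style prompts it can help to know how long the description actually is. The sketch below counts the words of a prompt while ignoring any trailing `--` parameters. The helper names and the sample prompt are hypothetical, and `WORD_TARGET` is simply the 350-word figure quoted above, not a limit taken from Midjourney's documentation.

```python
WORD_TARGET = 350  # figure quoted in this article, used only for comparison


def split_prompt(prompt: str) -> tuple[str, str]:
    """Split a prompt into its description and its trailing '--' parameters."""
    # Assumption: the first ' --' marks the start of the parameter section.
    body, sep, params = prompt.partition(" --")
    return body.strip(), (sep + params).strip()


def word_count(prompt: str) -> int:
    """Count whitespace-separated words in the description part only."""
    body, _ = split_prompt(prompt)
    return len(body.split())


# Hypothetical prompt, for illustration only.
sample = "I am sitting on a bench in the snow, holding a paper cup of tea --v 6.0 --s 100"
body, params = split_prompt(sample)
print(f"{word_count(sample)} of about {WORD_TARGET} words; parameters: {params}")
```

Splitting whitespace is only a rough measure; it treats a hyphenated term such as "first-person" as one word.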
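1. You can describe yourself resting at an earthquake search-and-rescue site, sitting there and taking a casual shot with your phone while praying that everyone is safe.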
  • prompt:I captured a photograph while at the earthquake scene, where I am seated on the rubble. In the lens's frame, only my feet are visible, surrounded by the debris and remnants of the disaster. This perspective offers a personal glimpse into the aftermath, focusing on the point where I am physically connected to the scene, amidst the devastation.

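May our compatriots affected by the earthquake return to normal life soon

2. You can play in the snow in Northeast China in December, where the photo is a happy moment that would normally take thousands of shutter clicks to catch.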
  • prompt:Envision a photograph you've captured of a snowy landscape in China's Northeast, the scene filled with the serenity and intensity of a heavy snowfall. In the image, your arm is extended towards the camera, your palm open and facing upwards as delicate snowflakes drift down from the grey, cloud-filled sky, landing softly on your skin. Each snowflake is unique, perhaps visible in detail against the contrasting backdrop of your glove or bare hand.Beyond your hand, the scene opens up to a vast expanse of a winter wonderland. Snow blankets everything in sight, covering trees, fields, and structures in a pristine white layer. The snow continues to fall heavily, blurring the lines between sky and land, creating a sense of quiet isolation and beauty. The world seems hushed and still, except for the dance of the snowflakes.

At the end of 2023, the "little potatoes" from the south all headed north to see the snow

3. This is just a dream
  • prompt:Imagine a photograph you've taken, capturing a tender and intimate moment between you and your girlfriend. She is walking ahead of you, her black hair cascading down her back, a symbol of grace and movement. The focus of the image is her back, as she moves forward, perhaps slightly turned to the side, unaware or coyly acknowledging the camera. Your hand extends into the frame, reaching forward to gently grasp her hand, a gesture of connection and affection. The viewer can see only her back and your hand, creating a sense of closeness and companionship. There's a contrast in the image between the movement suggested by her walking and the stillness of the hand-holding moment, capturing the dynamic of your relationship in a single, frozen frame. The background might be softly blurred, emphasizing the focus on the two of you, with the details of the surrounding environment fading into the periphery. The photo tells a story of a shared journey, a moment of tenderness, and a personal connection that speaks louder than words. It's a snapshot of life, love, and the simple, yet profound act of walking together, hand in hand.

Quick, take her hand!

4. A crucial interception in front of the goal
  • prompt:A soccer player's first-person perspective as they swiftly approach the goal, maneuver past defenders, and take a powerful shot, sending the ball arching into the top corner of the net.

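December 23, 2023: the postponed Premier League match between Manchester City and Brentford

5. A blonde woman showing four emotions: joy, anger, sorrow, and happiness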
  • prompt:The image features a 25-year-old woman with golden long hair against a pure white background. She is depicted displaying a spectrum of emotions:(happiness,) (anger)(sorrow)(joy) . Each emotion is vividly portrayed through her expressive facial features and body language, with a particular focus on the subtleties of her eyes and the flow of her hair. The stark white background accentuates her figure and the rich, dynamic expressions that cross her face, capturing the essence of each feeling.

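Four expressions: joy, anger, sorrow, and happiness

6. Suppose you took a two-person photo with a camera, and specify everything from the overall composition to the subjects' ages and clothing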
  • prompt:35mm film still, two-shot of a 50 year old black man with a grey beard wearing a brown jacket and red scarf standing next to a 20 year old white woman wearing a navy blue and cream houndstooth coat and black knit beanie. They are walking down the middle of the street at midnight, illuminated by the soft orange glow of the street lights --ar 7:5 --style raw --v 6.0

A 35mm film frame of the two exchanging a glance

7. On the streets of Akihabara, Japan
  • prompt:In the image, imagine a person with a stout figure walking through Akihabara, Tokyo's bustling district known for its electronic stores and pop culture. The individual is wearing casual, comfortable clothing, perhaps adorned with vibrant anime graphics, reflective of the area's otaku culture. They might be seen carrying shopping bags filled with gadgets, manga, or anime merchandise, looking content and absorbed in the lively atmosphere of the streets lined with colorful signs and bustling with fellow enthusiasts. The surrounding scenery is a vivid array of neon lights and posters, emblematic of Akihabara's unique vibe.

This is how the author of this article pictures himself
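Several of the prompts above end with parameters such as `--ar 7:5 --style raw --v 6.0`. As a rough sketch, those trailing parameters can be pulled out of a prompt string with a regular expression so they can be logged or compared between runs. The function name `parse_params` is hypothetical, and the pattern assumes parameters always look like `--name value` or a bare `--name`.

```python
import re

# Matches "--name value" or a bare "--name" (when the next token is another parameter).
PARAM_PATTERN = re.compile(r"--(\w+)(?:\s+(?!--)(\S+))?")


def parse_params(prompt: str) -> dict[str, str]:
    """Return the '--' parameters of a prompt as a name -> value mapping."""
    return {name: value for name, value in PARAM_PATTERN.findall(prompt)}


# Tail of the 35mm film prompt from item 6 above.
tail = "illuminated by the soft orange glow of the street lights --ar 7:5 --style raw --v 6.0"
print(parse_params(tail))  # {'ar': '7:5', 'style': 'raw', 'v': '6.0'}
```

A description that itself contains a double hyphen would be misread as a parameter, so this is only suitable for prompts written in the style shown here.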

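Of course, Midjourney V6 is by no means perfect, and it cannot even be said to have started replacing human work. Out of the more than a thousand images produced during my experiments, here is the only case I would describe as "uncomfortable":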
  • prompt:In the scene, two adorable children, a boy and a girl, are playing by a riverside, completely immersed in their joyful activity. They are covered in mud from head to toe, a testament to their uninhibited exploration and fun. The boy, with a mischievous glint in his eyes, is in the midst of splashing in a shallow puddle, his laughter echoing in the air. Beside him, the girl, with a bright, carefree smile, is shaping the mud with her small hands, perhaps building castles or imaginary shapes. Their clothing is simple and casual, suited for play, and now adorned with the natural art of the riverside. The background captures the gentle flow of the river and the soft glow of the late afternoon sun, casting a warm, golden light on the scene, highlighting their youthful exuberance and the simple joys of childhood.

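Sharp detail, crisp reflections on the river, lighting at its fullest

In the first batch of images, most details met expectations, so I went straight to a variation pass, hoping to add more facial detail and steer the result toward "Chinese children". That is where things went wrong.

The stereotypical flat face and narrow eyes

I will leave it at that; these are AI-generated images, and we are only discussing the technical side.
With V6, Midjourney has taken the interaction between AI and people to a new level, and the coherence and quality of the results set a new milestone, but it is still far from ready for production use.

Complex scenes with several people tend to produce illogical proportions

Most of the satisfying improvements in Midjourney V6 are concentrated in close-ups and near shots; its handling of distant group compositions, the details of facial expressions, and some color combinations leaves a lot to be desired.
Beyond that, head-to-body proportions in complex scenes, specific actions, hand interactions, and the difference between shaking hands and holding hands are all unsatisfactory. The handling of differences between ethnicities is also unsatisfactory at present.

Hands and limbs have always been a weak spot of AI models

Still, it deserves credit: in the six months since the previous version, V5.2, was released, Midjourney has again turned in a high-scoring result. It is just that, on the road of AI, the bar for a perfect score keeps rising geometrically.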

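Reproduction in any form on websites, forums, or communities is strictly prohibited without authorization from 「AI科技評論」.

Official accounts wishing to repost should first leave a message in the 「AI科技評論」 backend to obtain authorization; reposts must credit the source and include this official account's name card.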
