Google’s Whisk AI generator will ‘remix’ the pictures you plug in

The Verge - Artificial Intelligences, December 17, 2024

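Google has released a new AI tool, Whisk, that lets users generate new images by supplying images as prompts rather than writing long text descriptions. Users can supply multiple images to specify the subject, scene, and style, or they can use AI-generated images provided by Google. Whisk produces images along with corresponding text prompts; users can favorite or download the results and refine them by editing the text prompt. The tool is intended for rapid visual exploration rather than pixel-level editing, and it is built on the latest Imagen 3 image generation model. Google also introduced the video generation model Veo 2, which it says better understands the language of cinematography and hallucinates less often.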

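🖼️ Whisk lets users generate new images using images as prompts, with no long text description required, which simplifies the AI image generation workflow.

🎲 Users can use multiple images to define the subject, scene, and style of the generated image, can use AI-generated images provided by Google as prompts, and can add detail with text prompts.

⚙️ After generating an image, Whisk also produces a corresponding text prompt; users can favorite or download the image, or refine the result by editing the text prompt.

🚀 Whisk is built on the latest Imagen 3 image generation model and is intended for rapid visual exploration rather than precise pixel-level editing.

🎬 Google also introduced the video generation model Veo 2, which it says better understands the language of cinematography and hallucinates less often.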

An AI-generated image I made in Whisk using Google’s suggested images as prompts. | Image: Google via Whisk

Google has announced a new AI tool called Whisk that lets you generate images using other images as prompts instead of requiring a long text prompt.

With Whisk, you can offer images to suggest what you’d like as the subject, the scene, and the style of your AI-generated image, and you can prompt Whisk with multiple images for each of those three things. (If you want, you can fill in text prompts, too.) If you don’t have images on hand, you can click a dice icon to have Google fill in some images for the prompts (though those images also appear to be AI-generated). You can also enter some text into a text box at the end of the process if you want to add extra detail about the image you’re looking for, but it’s not required.

Whisk will then generate images and a text prompt for each image. You can favorite or download the image if you’re happy with the results, or you can refine an image by entering more text into the text box or clicking the image and editing the text prompt.

A screenshot of Whisk. I clicked the dice to generate a subject, scene, and style. I swapped out the auto-generated scene by entering a text prompt. Whisk created the first two images, which I iterated on by asking Whisk to add some steam around the subject (because it’s a fire being in water), resulting in the next two images.

In a blog post, Google stresses that Whisk is designed to be for “rapid visual exploration, not pixel-perfect edits.” The company also says that Whisk may “miss the mark,” which is why it lets you edit the underlying prompts.

In the few minutes I’ve used the tool while writing this story, it’s been entertaining to tinker with. Images take a few seconds to generate, which is annoying, and while the images have been a little strange, everything I’ve generated has been fun to iterate on.

Google says Whisk uses the “latest” iteration of its Imagen 3 image generation model, which it announced today. Google also introduced Veo 2, the next version of its video generation model, which the company says has an understanding of “the unique language of cinematography” and hallucinates things like extra fingers “less frequently” than other models (one of those other models is probably OpenAI’s Sora). Veo 2 is coming first to Google’s VideoFX, which you can join the Google Labs waitlist for, and it will be expanded to YouTube Shorts and “other products” sometime next year.

