Unknown data source, September 15, 2024
A developer’s guide to getting started with Imagen 3 on Vertex AI

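Google has released Imagen 3, a new AI model that generates high-quality images and offers stronger control and safety features. Imagen 3 can generate images in a range of artistic styles and formats, supports text rendering, and produces more accurate images from more detailed prompts. Google also offers Imagen 3 Fast, a model that generates images more quickly, and supports SynthID watermarking to protect creative work and encourage responsible use.

🎨 **Uncompromising quality and versatility** Imagen 3 sets a new standard for quality and control in image generation. It produces photorealistic images with exceptional composition, sharpness, color accuracy, and resolution. Imagen 3 supports a wide range of artistic styles and formats, from photorealistic masterpieces to whimsical claymation scenes, providing the tools to express a unique artistic vision.

💬 **Closer prompt understanding** Imagen 3 understands more nuanced natural language descriptions and generates images that match them. You can specify details such as camera angles, lens types, and image composition, and Imagen 3 follows the prompt closely, narrowing the gap between what you imagine and the final image.

🚀 **Faster generation** In addition to Imagen 3, Google offers Imagen 3 Fast, which is optimized for generation speed. Imagen 3 Fast is suited to brighter, higher-contrast images and has 40% lower latency than Imagen 2.

🛡️ **Protect your work and create responsibly** Imagen 3 has built-in safeguards that let users focus on their artistic vision without worrying about control. In partnership with Google DeepMind, Imagen 3 uses SynthID to embed an invisible digital watermark at the pixel level. A digital watermark is added to all Imagen 3 generated images by default, and the add_watermark parameter lets you set this explicitly. You can also use the API to verify whether an image was generated with Imagen, which confirms the authenticity of AI-generated images and helps safeguard your work from misuse.

🔐 **Advanced safety filters** Imagen 3 provides advanced safety filters that let you control the types of images generated so they match your brand values or principles. To configure the safety filter threshold for generated images, modify safety_filter_level, which can be set to “block_most”, “block_some”, or “block_few”. To control the type of people that appear in generated images, set person_generation to “allow_all”, “allow_adult”, or “dont_allow”.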

Over the past few months, early users put Imagen 3 on Vertex AI through its paces and shared valuable insights with us. It’s clear that users want an AI model that generates stunning visuals and empowers their practical creative applications. From their feedback, we identified three common themes:

  • Demand for unparalleled quality across diverse artistic styles and formats

  • Desire for strong prompt adherence and fast image generation

  • Controls to protect and build trust with SynthID watermarking and advanced safety filters

Throughout this post, we will walk you through each of these themes in depth. We will also provide code examples and prompting best practices so you can get the most out of Imagen 3.

Uncompromising quality and versatility

Imagen 3 sets a new standard in quality and control over your generated images. This text-to-image model produces photorealistic visuals with exceptional composition, sharpness, color accuracy, and resolution. With Imagen 3, you can explore a wider spectrum of artistic styles and formats. From photorealistic masterpieces to whimsical claymation scenes, the model's expanded range of styles and formats provides the tools to express your unique artistic vision. 

To demonstrate these photorealistic capabilities, let’s walk through an example of creating image mockups for a new cookbook cover. The following prompt generates an image with incredible detail, composition, and photorealism.

```python
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

# TODO(developer): Update and un-comment below lines
# project_id = "PROJECT_ID"

vertexai.init(project=project_id, location="us-central1")

generation_model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")

prompt = """
A photorealistic image of a cookbook laying on a wooden kitchen table, the cover facing forward featuring a smiling family sitting at a similar table, soft overhead lighting illuminating the scene, the cookbook is the main focus of the image.
"""

image = generation_model.generate_images(
    prompt=prompt,
    number_of_images=1,
    aspect_ratio="1:1",
    safety_filter_level="block_some",
    person_generation="allow_all",
)

# OPTIONAL: View the generated image in a notebook
# image[0].show()
```

Text rendering 

Imagen 3 also brings new possibilities when it comes to rendering text within images. A fun way to play around with this feature is to generate images of greeting cards, posters, and social media posts with captions in various fonts and colors. Using it is as easy as adding to the prompt a short description of the text you would like to see. Let’s say you would like to add a title and regenerate the cookbook cover.

```python
prompt = """
A photorealistic image of a cookbook laying on a wooden kitchen table, the cover facing forward featuring a smiling family sitting at a similar table, soft overhead lighting illuminating the scene, the cookbook is the main focus of the image.

Add a title to the center of the cookbook cover that reads, "Everyday Recipes" in orange block letters.
"""

image = generation_model.generate_images(
    prompt=prompt,
    number_of_images=1,
    aspect_ratio="1:1",
    safety_filter_level="block_some",
    person_generation="allow_all",
)

# OPTIONAL: View the generated image in a notebook
# image[0].show()
```

Closer to your intent

Imagen 3's prompt comprehension translates your natural language descriptions, no matter how nuanced, into closely matched visuals. You can specify everything from specific camera angles to types of lenses to image compositions in your description. Imagen 3 adheres closely to the prompt, which helps close the gap between your mental picture and the final image. You can provide the model with simple subject-action-setting prompts or intricate, multi-layered descriptions, and the model adapts to your creative process to enable a broad range of styles.

Since Imagen 3 does well with elaborate prompts, providing robust details usually yields higher quality and more precise results. Below are a few options to consider when crafting your prompts:

  • Arrangement: Direct the scene by specifying where you want subjects positioned.

  • Lighting: Create atmosphere with soft or harsh lighting, and control its direction and focus.

  • Angles & lenses: Add depth and perspective with camera angles and lens choices.

  • Styles: Go beyond photorealism and generate digital art, cinematic, vintage, minimalist images, and more.
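As a rough illustration of combining these options, the sketch below assembles a prompt from separate arrangement, lighting, and camera fragments. The `build_prompt` helper and its parameter names are hypothetical conveniences for this post, not part of the Vertex AI SDK; the resulting string would simply be passed as `prompt` to `generate_images`.

```python
def build_prompt(subject, style="photorealistic", arrangement=None,
                 lighting=None, camera=None):
    """Join a subject with optional detail fragments into one prompt string.

    Hypothetical helper for illustration only; not part of the Vertex AI SDK.
    """
    parts = [f"A {style} image of {subject}"]
    # Append only the details that were actually provided.
    for detail in (arrangement, lighting, camera):
        if detail and detail.strip():
            parts.append(detail.strip())
    return ", ".join(parts) + "."


prompt = build_prompt(
    subject="a garden salad in a wooden bowl",
    arrangement="centered on a white marble table",
    lighting="soft natural light from the left",
    camera="shot from a 45-degree angle with a 50mm lens",
)
print(prompt)
```

Keeping each detail as its own argument makes it easy to vary one dimension at a time, for example swapping only the lighting, and compare the generated results.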

Reduced latency

While Imagen 3 is our highest quality model to date, we are also offering Imagen 3 Fast, which is optimized for generation speed. Imagen 3 Fast is suitable for creating brighter, higher contrast images. Compared to Imagen 2, Imagen 3 Fast delivers a 40% decrease in latency. To compare the two models, you can generate two images with the same prompt. Let’s generate two options for a photo of a salad to add to the same cookbook from earlier.

```python
generation_model_fast = ImageGenerationModel.from_pretrained(
    "imagen-3.0-fast-generate-001"
)

prompt = """
A photorealistic image of a garden salad overflowing with colorful vegetables like bell peppers, cucumbers, tomatoes, and leafy greens, sitting in a wooden bowl in the center of the image on a white marble table. Natural light illuminates the scene, casting soft shadows and highlighting the freshness of the ingredients.
"""

# Imagen 3 Fast image generation
fast_image = generation_model_fast.generate_images(
    prompt=prompt,
    number_of_images=1,
    aspect_ratio="1:1",
    safety_filter_level="block_some",
    person_generation="allow_all",
)

# OPTIONAL: View the generated image in a notebook
# fast_image[0].show()
```

Image generated by Imagen 3 Fast

```python
prompt = """
A photorealistic image of a garden salad overflowing with colorful vegetables like bell peppers, cucumbers, tomatoes, and leafy greens, sitting in a wooden bowl in the center of the image on a white marble table. Natural light illuminates the scene, casting soft shadows and highlighting the freshness of the ingredients.
"""

# Imagen 3 image generation
image = generation_model.generate_images(
    prompt=prompt,
    number_of_images=1,
    aspect_ratio="1:1",
    safety_filter_level="block_some",
    person_generation="allow_all",
)

# OPTIONAL: View the generated image in a notebook
# image[0].show()
```

Image generated by Imagen 3
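If you want to check the latency difference in your own environment, one simple approach is to time the same prompt against both models. The sketch below is a minimal, assumption-laden version: `time_call` and `compare_latency` are hypothetical helpers, and the two `fake_*` functions are stand-ins that merely sleep so the example runs without cloud credentials.

```python
import time


def time_call(fn, *args, **kwargs):
    """Call fn once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_latency(generators, prompt):
    """Time each labeled callable once on the same prompt.

    generators maps a label to a callable that accepts the prompt.
    """
    timings = {}
    for label, generate in generators.items():
        _, elapsed = time_call(generate, prompt)
        timings[label] = elapsed
    return timings


# Stand-in generators (assumptions): replace with real model calls.
def fake_standard(prompt):
    time.sleep(0.05)
    return "image"


def fake_fast(prompt):
    time.sleep(0.005)
    return "image"


timings = compare_latency(
    {"imagen-3": fake_standard, "imagen-3-fast": fake_fast},
    "A photorealistic image of a garden salad.",
)
for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s")
```

In practice you would swap the stand-ins for something like `lambda p: generation_model.generate_images(prompt=p, number_of_images=1)`. A single call is a noisy measurement, so averaging several runs gives a fairer comparison.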

Protect your work and create responsibly

Imagen 3 has built-in safeguards that let you focus on your artistic vision without compromising control. In partnership with Google DeepMind, Imagen 3 utilizes SynthID, a technology that embeds an invisible watermark at the pixel level. A digital watermark is added to all Imagen 3 generated images by default, and you can also set this explicitly with the add_watermark parameter. You can also use the API to verify whether an image was generated using Imagen. This verifies the authenticity of your AI-generated images, providing transparency and helping to safeguard your work from misuse.

With Imagen 3's advanced safety filters, you can also control the types of images generated to make sure they meet your brand values or principles. To configure safety filter thresholds for generated images, modify the safety_filter_level. The safety level can be changed to “block_most”, “block_some”, or “block_few”. To change the safety setting that controls the type of people generated, modify person_generation to “allow_all”, “allow_adult”, or “dont_allow”.

```python
# Imagen 3 image generation
image = generation_model.generate_images(
    prompt=prompt,
    number_of_images=1,
    aspect_ratio="1:1",
    safety_filter_level="block_some",
    person_generation="allow_all",
    add_watermark=True,
)
```
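Because these settings are plain strings, a typo is easy to miss until the request fails. One option is to validate them up front. The sketch below checks the values against the options listed in this post; `build_safety_kwargs` is a hypothetical helper, and its defaults simply mirror the examples here rather than any documented SDK defaults.

```python
# Allowed values as listed in this post.
ALLOWED_SAFETY_LEVELS = {"block_most", "block_some", "block_few"}
ALLOWED_PERSON_GENERATION = {"allow_all", "allow_adult", "dont_allow"}


def build_safety_kwargs(safety_filter_level="block_some",
                        person_generation="allow_all",
                        add_watermark=True):
    """Validate safety settings and return them as keyword arguments.

    Hypothetical helper for illustration; defaults mirror this post's examples.
    """
    if safety_filter_level not in ALLOWED_SAFETY_LEVELS:
        raise ValueError(
            f"safety_filter_level must be one of {sorted(ALLOWED_SAFETY_LEVELS)}"
        )
    if person_generation not in ALLOWED_PERSON_GENERATION:
        raise ValueError(
            f"person_generation must be one of {sorted(ALLOWED_PERSON_GENERATION)}"
        )
    return {
        "safety_filter_level": safety_filter_level,
        "person_generation": person_generation,
        "add_watermark": bool(add_watermark),
    }


strict = build_safety_kwargs(
    safety_filter_level="block_most", person_generation="dont_allow"
)
print(strict)
```

The returned dictionary can then be unpacked into the call, for example `generation_model.generate_images(prompt=prompt, number_of_images=1, **strict)`.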

What’s next?

Imagen 3 is now generally available with an allowlist. The developers who've already experienced Imagen 3 are buzzing about its photorealistic capabilities and quality. As one early adopter remarked,

“The precision and realism in capturing the diverse locations and objects of destinations around the world is particularly impressive”, adding that “this level of detail is sure to be a strong competitive edge for Imagen 3.” – Sungmin Han

We’re currently prioritizing access to Imagen 3 on Vertex AI for developers at businesses with well-defined use cases. You can sign up for access through this form. We'll review your application and get back to you as soon as possible. 

In the meantime you can learn more about Imagen 3 and integrate its capabilities in your applications by checking out the resources below!
