AWS Machine Learning Blog 01月28日
Image and video prompt engineering for Amazon Nova Canvas and Amazon Nova Reel
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

亚马逊在Amazon Bedrock上推出图像生成模型Nova Canvas和视频创作模型Nova Reel,可将文本和图像输入转化为定制视觉内容,本文还分享了相关的提示工程技巧及应用案例。

🎨亚马逊推出Nova Canvas和Nova Reel,用于图像和视频创作

📝探讨如何为图像和视频生成模型撰写有效提示

🌄详细介绍好提示的要素,如主体、动作、环境等

🎬给出视频生成提示的要求和最佳实践

Amazon has introduced two new creative content generation models on Amazon Bedrock: Amazon Nova Canvas for image generation and Amazon Nova Reel for video creation. These models transform text and image inputs into custom visuals, opening up creative opportunities for both professional and personal projects. Nova Canvas, a state-of-the-art image generation model, creates professional-grade images from text and image inputs, ideal for applications in advertising, marketing, and entertainment. Similarly, Nova Reel, a cutting-edge video generation model, supports the creation of short videos from text and image inputs, complete with camera motion controls using natural language. Although these models are powerful tools for creative expression, their effectiveness relies heavily on how well users can communicate their vision through prompts.

This post dives deep into prompt engineering for both Nova Canvas and Nova Reel. We share proven techniques for describing subjects, environments, lighting, and artistic style in ways the models can understand. For Nova Reel, we explore how to effectively convey camera movements and transitions through natural language. Whether you’re creating marketing content, prototyping designs, or exploring creative ideas, these prompting strategies can help you unlock the full potential of the visual generation capabilities of Amazon Nova.

Solution overview

To get started with Nova Canvas and Nova Reel, you can either use the Image/Video Playground on the Amazon Bedrock console or access the models through APIs. For detailed setup instructions, including account requirements, model access, and necessary permissions, refer to Creative content generation with Amazon Nova.

Generate images with Nova Canvas

Prompting for image generation models like Amazon Nova Canvas is both an art and a science. Unlike large language models (LLMs), Nova Canvas doesn’t reason or interpret command-based instructions explicitly. Instead, it transforms prompts into images based on how well the prompt captures the essence of the desired image. Writing effective prompts is key to unlocking the full potential of the model for various use cases. In this post, we explore how to craft effective prompts, examine essential elements of good prompts, and dive into typical use cases, each with compelling example prompts.

Essential elements of good prompts

A good prompt serves as a descriptive image caption rather than command-based instructions. It should provide enough detail to clearly describe the desired outcome while maintaining brevity (limited to 1,024 characters). Instead of giving command-based instructions like “Make a beautiful sunset,” you can achieve better results by describing the scene as if you’re looking at it: “A vibrant sunset over mountains, with golden rays streaming through pink clouds, captured from a low angle.” Think of it as painting a vivid picture with words to guide the model effectively.

Effective prompts start by clearly defining the main subject and action:

Add further context:

After you define the main focus of the image, you can refine the prompt further by specifying additional attributes such as visual style, framing, lighting, and technical parameters. For instance:

Finally, you can use a negative prompt to exclude specific elements or artifacts from your composition, aligning it more closely to your vision:

Let’s bring all these elements together with an example prompt with a basic and enhanced version that shows these techniques in action.

Basic: A car in front of a house Enhanced: A blue luxury sports car parked in front of a grand villa overlooking Lake Como. The setting features meticulously manicured gardens, with the lake’s sparkling waters and distant mountains in the background. The car’s sleek, polished surface reflects the surrounding elegance, enhanced by soft, diffused lighting from a cloudy sky. A wide-angle shot capturing the car, villa, and lake in harmony, rendered in a cinematic style with vivid, high-contrast details.

Example image generation prompts

By mastering these techniques, you can create stunning visuals for a wide range of applications using Amazon Nova Canvas. The following images have been generated using Nova Canvas at a 1280×720 pixel resolution with a CFG scale of 6.5 and seed of 0 for reproducibility (see Request and response structure for image generation for detailed explanations of these parameters). This resolution also matches the image dimensions expected by Nova Reel, allowing seamless integration for video experimentation. Let’s examine some example prompts and their resulting images to see these techniques in action.

Aerial view of sparse arctic tundra landscape, expansive white terrain with meandering frozen rivers and scattered rock formations. High-contrast black and white composition showcasing intricate patterns of ice and snow, emphasizing texture and geological diversity. Bird’s-eye perspective capturing the abstract beauty of the arctic wilderness. An overhead shot of premium over-ear headphones resting on a reflective surface, showcasing the symmetry of the design. Dramatic side lighting accentuates the curves and edges, casting subtle shadows that highlight the product’s premium build quality
An angled view of a premium matte metal water bottle with bamboo accents, showcasing its sleek profile. The background features a soft blur of a serene mountain lake. Golden hour sunlight casts a warm glow on the bottle’s surface, highlighting its texture. Captured with a shallow depth of field for product emphasis. Watercolor scene of a cute baby dragon with pearlescent mint-green scales crouched at the edge of a garden puddle, tiny wings raised. Soft pastel flowers and foliage frame the composition. Loose, wet-on-wet technique for a dreamy atmosphere, with sunlight glinting off ripples in the puddle.
Abstract figures emerging from digital screens, glitch art aesthetic with RGB color shifts, fragmented pixel clusters, high contrast scanlines, deep shadows cast by volumetric lighting An intimate portrait of a seasoned fisherman, his face filling the frame. His thick gray beard is flecked with sea spray, and his knit cap is pulled low over his brow. The warm glow of sunset bathes his weathered features in golden light, softening the lines of his face while still preserving the character earned through years at sea. His eyes reflect the calm waters of the harbor behind him.

Generate video with Nova Reel

Video generation models work best with descriptive prompts rather than command-based instructions. When crafting prompts, focus on what you want to see rather than telling the model what to do. Think of writing a detailed caption or scene description like you’re explaining a video that already exists. For example, describing elements like the main subjects, what they’re doing, the setting around them, how the scene is lit, the overall artistic style, and any camera movements can help create more accurate results. The key is to paint a complete picture through description rather than giving step-by-step directions. This means instead of saying “create a dramatic scene,” you could describe “a stormy beach at sunset with crashing waves and dark clouds, filmed with a slow aerial shot moving over the coastline.” The more specific and descriptive you can be about the visual elements you want, the better the output will be. You might want to include details about the subject, action, environment, lighting, style, and camera motion.

When writing a video generation prompt for Nova Reel, be mindful of the following requirements and best practices:

When describing camera movements in your video prompts, be specific about the type of motion you want—whether it’s a smooth dolly shot (moving forward/backward), a pan (sweeping left/right), or a tilt (moving up/down). For more dynamic effects, you can request aerial shots, orbit movements, or specialized techniques like dolly zooms. You can also specify the speed of the movement; refer to camera controls for more ideas.

Example video generation prompts

Let’s look at some of the outputs from Amazon Nova Reel.

Track right shot of single red balloon floating through empty subway tunnel. Balloon glows from within, casting soft red light on concrete walls. Cinematic 4k, moody lighting Dolly in shot of peaceful deer drinking from forest stream. Sunlight filtering through and bokeh of other deer and forest plants. 4k cinematic.
Pedestal down in a pan in modern kitchen penne pasta with heavy cream white sauce on top, mushrooms and garlic, steam coming out. Orbit shot of crystal light bulb centered on polished marble surface, gentle floating gears spinning inside with golden glow. Premium lighting. 4k cinematic.

Generate image-based video using Nova Reel

In addition to the fundamental text-to-video, Amazon Nova Reel also supports an image-to-video feature that allows you to use an input reference image to guide video generation. Using image-based prompts for video generation offers significant advantages: it speeds up your creative iteration process and provides precise control over the final output. Instead of relying solely on text descriptions, you can visually define exactly how you want your video to start. These images can come from Nova Canvas or from other sources where you have appropriate rights to use them. This approach has two main strategies:

This method streamlines your creative workflow by using your image as the visual foundation. Rather than spending time refining text prompts to achieve the right look, you can quickly create an image in Nova Canvas (or source it elsewhere) and use it as your starting point in Nova Reel. This approach leads to faster iterations and more predictable results compared to pure text-to-video generation.

Let’s take some images we created earlier with Amazon Nova Canvas and use them as reference frames to generate videos with Amazon Nova Reel.

Slow dolly forward A 3d animated film: A mint-green baby dragon is talking dramatically. Emotive, high quality animation.
Track right of a premium matte metal water bottle with bamboo accents with background serene mountain lake ripples moving Orbit shot of premium over-ear headphones on a reflective surface. Dramatic side lighting accentuates the curves and edges, casting subtle shadows that highlight the product’s premium build quality.

Tying it all together

You can transform your storytelling by combining multiple generated video clips into a captivating narrative. Although Nova Reel handles the video generation, you can further enhance the content using preferred video editing software to add thoughtful transitions, background music, and narration. This combination creates an immersive journey that transports viewers into compelling storytelling. The following example showcases Nova Reel’s capabilities in creating engaging visual narratives. Each clip is crafted with professional lighting and cinematic quality.

Best practices

Consider the following best practices:

Fundamentals of prompt engineering for Amazon Nova

Effective prompt engineering for Amazon Nova Canvas and Nova Reel follows an iterative process, designed to refine your input and achieve the desired output. This process can be visualized as a logical flow chart, as illustrated in the following diagram.

The process consists of the following steps:

    Begin by crafting your initial prompt, focusing on descriptive elements such as subject, action, environment, lighting, style, and camera position or motion. After generating the first output, assess whether it aligns with your vision. If the result is close to what you want, proceed to the next step. If not, return to Step 1 and refine your prompt. When you’ve found a promising direction, maintain the same seed value. This maintains consistency in subsequent generations while you make minor adjustments. Make small, targeted changes to your prompt. These could involve adjusting descriptors, adding or removing elements, and more. Generate a new output with your refined prompt, keeping the seed value constant. Evaluate the new output. If it’s satisfactory, move to the next step. If not, return to Step 4 and continue refining. When you’ve achieved a satisfactory result, experiment with different seed values to create variations of your successful prompt. This allows you to explore subtle differences while maintaining the core elements of your desired output. Select the best variations from your generated options as your final output.

This iterative approach allows for systematic improvement of your prompts, leading to more accurate and visually appealing results from Nova Canvas and Nova Reel models. Remember, the key to success lies in making incremental changes, maintaining consistency where needed, and being open to exploring variations after you’ve found a winning formula.

Conclusion

Understanding effective prompt engineering for Nova Canvas and Nova Reel unlocks exciting possibilities for creating stunning images and compelling videos. By following the best practices outlined in this guide—from crafting descriptive prompts to iterative refinement—you can transform your ideas into production-ready visual assets.

Ready to start creating? Visit the Amazon Bedrock console today to experiment with Nova Canvas and Nova Reel in the Amazon Bedrock Playground or using the APIs. For detailed specifications, supported features, and additional examples, refer to the following resources:


About the authors

Yanyan Zhang is a Senior Generative AI Data Scientist at Amazon Web Services, where she has been working on cutting-edge AI/ML technologies as a Generative AI Specialist, helping customers use generative AI to achieve their desired outcomes. Yanyan graduated from Texas A&M University with a PhD in Electrical Engineering. Outside of work, she loves traveling, working out, and exploring new things.

Kris Schultz has spent over 25 years bringing engaging user experiences to life by combining emerging technologies with world class design. In his role as Senior Product Manager, Kris helps design and build AWS services to power Media & Entertainment, Gaming, and Spatial Computing.

Sanju Sunny is a Digital Innovation Specialist with Amazon ProServe. He engages with customers in a variety of industries around Amazon’s distinctive customer-obsessed innovation mechanisms in order to rapidly conceive, validate and prototype new products, services and experiences.

Nitin Eusebius is a Sr. Enterprise Solutions Architect at AWS, experienced in Software Engineering, Enterprise Architecture, and AI/ML. He is deeply passionate about exploring the possibilities of generative AI. He collaborates with customers to help them build well-architected applications on the AWS platform, and is dedicated to solving technology challenges and assisting with their cloud journey.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

亚马逊 创意内容 Nova Canvas Nova Reel 提示工程
相关文章