Text-to-image basics with Amazon Nova Canvas

AI image generation has emerged as one of the most transformative technologies in recent years, revolutionizing how you create and interact with visual content. Amazon Nova Canvas is a generative model in the suite of Amazon Nova creative models that enables you to generate realistic and creative images from plain text descriptions.

This post serves as a beginner’s guide to using Amazon Nova Canvas. We begin with the steps to get set up on Amazon Bedrock. Amazon Bedrock is a fully managed service that hosts leading foundation models (FMs) for various use cases such as text, code, and image generation; summarization; question answering; and custom use cases that involve fine-tuning and Retrieval Augmented Generation (RAG). In this post, we focus on the Amazon Nova image generation models available in AWS Regions in the US, in particular, the Amazon Nova Canvas model. We then provide an overview of the image generation process (diffusion) and dive deep into the input parameters for text-to-image generation with Amazon Nova Canvas.

Get started with image generation on Amazon Bedrock

Complete the following steps to get set up with access to Amazon Nova Canvas and the image playground:

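Create or sign in to an AWS account.

Use an AWS Identity and Access Management (IAM) identity that has permission to use Amazon Bedrock.

On the Amazon Bedrock console, open the Regions list and choose a Region where Amazon Nova Canvas is available, such as US East (N. Virginia).

In the navigation pane, choose Model access under Bedrock configurations.

In the What is Model access section, choose Modify model access or Enable specific models.

Select Nova Canvas, then choose Next.

On the Review and submit page, choose Submit.

Under Base models, confirm that the access status for Nova Canvas shows Access Granted.

In the navigation pane, choose Image / Video under Playgrounds.

Choose Select model, then choose Amazon and Nova Canvas, and choose Apply.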

You are all set up to start generating images with Amazon Nova Canvas on Amazon Bedrock. The following screenshot shows an example of our playground.
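If you prefer to check from code which image models a Region offers, the following is a minimal sketch. It assumes a boto3 client for the `bedrock` control-plane service is passed in, and the helper names `list_image_model_ids` and `has_nova_canvas` are hypothetical. Note that the listing shows which models exist in the Region; it does not by itself confirm that access has been granted to your account.

```python
def list_image_model_ids(bedrock_client, provider="Amazon"):
    """Return the IDs of image-output foundation models offered by a provider."""
    response = bedrock_client.list_foundation_models(
        byProvider=provider,
        byOutputModality="IMAGE",
    )
    return [model["modelId"] for model in response.get("modelSummaries", [])]


def has_nova_canvas(bedrock_client, model_id="amazon.nova-canvas-v1:0"):
    """Check whether the Nova Canvas model ID appears in the Region's listing."""
    return model_id in list_image_model_ids(bedrock_client)
```

With boto3 installed and credentials configured, you could call it as `has_nova_canvas(boto3.client("bedrock", region_name="us-east-1"))`.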

Understanding the generation process

Amazon Nova Canvas uses diffusion-based approaches to generate images:

Starting point

Iterative denoising

Text conditioning

embedding space

Image conditioning

Safety and fairness
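To build intuition for the starting point and iterative denoising ideas, here is a deliberately simplified toy in plain Python. It is not how Amazon Nova Canvas is implemented: a real diffusion model uses a neural network to predict noise from the current image and the prompt embedding, whereas this sketch assumes the "clean" target is already known and treats the gap to it as the predicted noise. The function name `toy_denoise` is hypothetical.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Walk from random noise to `target` in `steps` small denoising moves."""
    rng = random.Random(seed)
    # Starting point: pure Gaussian noise, one value per "pixel".
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for step in range(steps):
        remaining = steps - step
        # Stand-in for the model's noise prediction (a real model would
        # estimate this from the noisy image and the prompt embedding).
        predicted_noise = [xi - ti for xi, ti in zip(x, target)]
        # Iterative denoising: remove an equal share of the remaining noise.
        x = [xi - ni / remaining for xi, ni in zip(x, predicted_noise)]
    return x
```

Because the seed fixes the initial noise, calling the function twice with the same arguments follows the same path, which mirrors why seeds make generations reproducible.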

Prompting fundamentals

Image generation begins with effective prompting: crafting text descriptions that guide the model toward your desired output. Well-constructed prompts include specific details about subject, style, lighting, perspective, mood, and composition, and they work better when written as image captions rather than as commands or conversation. For example, rather than saying “generate an image of a mountain,” a more effective prompt might be “a majestic snow-capped mountain peak at sunset with dramatic lighting and wispy clouds, photorealistic style.” Refer to Amazon Nova Canvas prompting best practices for more information about prompting.

Let’s address the following prompt elements and observe their impact on the final output image:

Subject descriptions (what or who is in the image)

Style references (photography, oil painting, 3D render)

Compositional elements and technical specifications (foreground, background, perspective, lighting)

Positive and negative prompts

Positive prompts tell the model what to include. These are the elements, styles, and characteristics you want to observe in the final image. Avoid the use of negation words like “no,” “not,” or “without” in your prompt. Amazon Nova Canvas has been trained on image-caption pairs, and captions rarely describe what isn’t in an image. Therefore, the model has never learned the concept of negation. Instead, use negative prompts to specify elements to exclude from the output.

Negative prompts specify what to avoid. Common negative prompts include “blurry,” “distorted,” “low quality,” “poor anatomy,” “bad proportions,” “disfigured hands,” or “extra limbs,” which help models avoid typical generation artifacts.

In the following examples, we first use the prompt “An aerial view of an archipelago,” then we refine the prompt as “An aerial view of an archipelago. Negative Prompt: Beaches.”

The balance between positive and negative prompting creates a defined creative space for the model to work within, often resulting in more predictable and desirable outputs.
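In an API request, the positive and negative prompts travel in separate fields (`text` and `negativeText`) rather than in one string. The following sketch builds such a request body for the archipelago example; the helper name `build_text_image_body` is hypothetical, and the body omits the optional `imageGenerationConfig` block for brevity.

```python
import json


def build_text_image_body(prompt, negative_prompt=None):
    """Build a TEXT_IMAGE request body with an optional negative prompt."""
    params = {"text": prompt}
    if negative_prompt:
        # Only include the field when there is something to exclude.
        params["negativeText"] = negative_prompt
    return json.dumps({"taskType": "TEXT_IMAGE", "textToImageParams": params})


body = build_text_image_body(
    "An aerial view of an archipelago",
    negative_prompt="Beaches",
)
```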

Image dimensions and aspect ratios

Amazon Nova Canvas is trained on 1:1, portrait, and landscape resolutions. Generation tasks have a maximum output resolution of 4.19 million pixels (that is, 2048×2048, 2816×1536). For editing tasks, the input image should be no more than 4,096 pixels on its longest side, have an aspect ratio between 1:4 and 4:1, and have a total pixel count of 4.19 million or smaller. Understanding these dimensional limits helps you avoid stretched or distorted results, particularly for specialized composition needs.
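A quick pre-flight check can catch out-of-range images before you send an editing request. The sketch below encodes only the editing limits stated above, under two assumptions: the 4,096-pixel figure is read as an upper bound on the longest side, and "4.19 million" is read as 2048 × 2048 = 4,194,304 pixels. The function name `check_edit_image_size` is hypothetical, and the service may enforce further rules not covered here.

```python
MAX_SIDE = 4096            # longest side, in pixels
MAX_PIXELS = 2048 * 2048   # 4,194,304; assumed meaning of "4.19 million"


def check_edit_image_size(width, height):
    """Return a list of problems; an empty list means the size looks acceptable."""
    problems = []
    if max(width, height) > MAX_SIDE:
        problems.append("longest side exceeds 4,096 pixels")
    ratio = width / height
    if not 0.25 <= ratio <= 4.0:
        problems.append("aspect ratio is outside 1:4 to 4:1")
    if width * height > MAX_PIXELS:
        problems.append("total pixel count exceeds 4.19 million")
    return problems
```

For example, `check_edit_image_size(4096, 512)` reports only the aspect-ratio problem, because an 8:1 image is too wide even though its pixel count is within the limit.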

Classifier-free guidance scale

The classifier-free guidance (CFG) scale controls how strictly the model follows your prompt:

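Low values (1.1–3): The model follows the prompt loosely and has more room for its own interpretation.

Medium values (4–7): The model balances prompt adherence with creative freedom.

High values (8–10): The model adheres closely to the prompt.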

In the following examples, we use the prompt “Cherry blossoms, bonsai, Japanese style landscape, high resolution, 8k, lush greens in the background.”

The first image with CFG 2 captures some elements of cherry blossoms and bonsai. The second image with CFG 8 adheres more to the prompt with a potted bonsai, more pronounced cherry blossom flowers, and lush greens in the background.

Think of the CFG scale as a dial between how literally the model takes your instructions and how much artistic interpretation it applies.
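The commonly published formulation of classifier-free guidance combines two noise predictions at each denoising step, one made without the prompt and one made with it, and uses the scale to push the result toward the prompted prediction. The sketch below shows that arithmetic on plain lists. This post does not describe how Amazon Nova Canvas implements guidance internally, so treat this as the textbook form rather than the model's actual code; `apply_cfg` is a hypothetical name.

```python
def apply_cfg(uncond_pred, cond_pred, cfg_scale):
    """Blend unconditional and prompt-conditioned predictions.

    guided = uncond + cfg_scale * (cond - uncond)
    """
    return [u + cfg_scale * (c - u) for u, c in zip(uncond_pred, cond_pred)]


# A scale of 1.0 returns the conditioned prediction unchanged;
# larger scales exaggerate the difference the prompt makes.
mild = apply_cfg([0.0, 1.0], [1.0, 1.0], cfg_scale=1.0)
strong = apply_cfg([0.0, 1.0], [1.0, 1.0], cfg_scale=8.0)
```

Where the two predictions agree (the second element), the scale has no effect; where they differ, a higher scale amplifies the prompt's influence.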

Seed values and reproducibility

Every image generation begins with a randomization seed, essentially a starting number (for example, 1234567890) that determines the initial conditions.

Reproducibility through seed values is essential for professional workflows. When the seed is fixed, you can iterate on the prompt or another input parameter and clearly see the effect of that one change, instead of getting a completely different image each time. The following images are generated using two slightly different prompts (“A portrait of a girl smiling” vs. “A portrait of a girl laughing”), while holding the seed value and all other parameters constant.
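One way to set up this kind of controlled comparison is to build every request from a single shared configuration so that only the prompt differs. The following is a sketch under assumptions: the helper name `build_variants` and the default width, height, and CFG values are illustrative choices, not recommendations from this post.

```python
import json


def build_variants(prompts, seed, width=1024, height=1024, cfg_scale=7.0):
    """Build one TEXT_IMAGE request body per prompt, sharing all other settings."""
    config = {
        "numberOfImages": 1,
        "width": width,
        "height": height,
        "cfgScale": cfg_scale,
        "seed": seed,  # Held constant across every variant
    }
    return [
        json.dumps({
            "taskType": "TEXT_IMAGE",
            "textToImageParams": {"text": prompt},
            "imageGenerationConfig": config,
        })
        for prompt in prompts
    ]


bodies = build_variants(
    ["A portrait of a girl smiling", "A portrait of a girl laughing"],
    seed=1234567890,
)
```

Each body can then be passed to `invoke_model` in turn, and any difference between the resulting images can be attributed to the prompt change.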

All preceding images in this post have been generated using the text-to-image (TEXT_IMAGE) task type of Amazon Nova Canvas, available through the Amazon Bedrock InvokeModel API. The following is the API request and response structure for image generation:

#Request Structure
{
    "taskType": "TEXT_IMAGE",
    "textToImageParams": {
        "text": string,          #Positive prompt
        "negativeText": string   #Negative prompt
    },
    "imageGenerationConfig": {
        "width": int,            #Image resolution width
        "height": int,           #Image resolution height
        "quality": "standard" | "premium",   #Image quality
        "cfgScale": float,       #Classifier-free guidance scale
        "seed": int,             #Seed value
        "numberOfImages": int    #Number of images to be generated (max 5)
    }
}

#Response Structure
{
    "images": string[],  #List of Base64-encoded images
    "error": string
}
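Because the response can carry either images or an error message, it helps to check the `error` field before decoding. The sketch below works on an already-parsed response dictionary and returns raw image bytes. Treating any non-empty `error` as a failure is a design choice made for this example, and the function name `decode_images` is hypothetical.

```python
import base64


def decode_images(response_json):
    """Return decoded image bytes, or raise if the response reports an error."""
    error = response_json.get("error")
    if error:
        raise RuntimeError(f"Image generation failed: {error}")
    # Each entry in "images" is a Base64-encoded string.
    return [base64.b64decode(image) for image in response_json.get("images", [])]
```

The returned bytes can be written straight to a file or opened with an imaging library such as Pillow.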

Code example

This solution can also be tested locally with a Python script or a Jupyter notebook. For this post, we use an Amazon SageMaker AI notebook using Python (v3.12). For more information, see Run example Amazon Bedrock API requests using an Amazon SageMaker AI notebook. For instructions to set up your SageMaker notebook instance, refer to Create an Amazon SageMaker notebook instance. Make sure the instance is set up in the same Region where Amazon Nova Canvas access is enabled. For this post, we create a Region variable to match the Region where Amazon Nova Canvas is enabled (us-east-1). You must modify this variable if you’ve enabled the model in a different Region. The following code demonstrates text-to-image generation by invoking the Amazon Nova Canvas v1.0 model using Amazon Bedrock. To understand the API request and response structure for different types of generations, parameters, and more code examples, refer to Generating images with Amazon Nova.

import base64  # For encoding/decoding base64 data
import io  # For handling byte streams
import json  # For JSON processing

import boto3  # AWS SDK for Python
from PIL import Image  # Python Imaging Library for image processing
from botocore.config import Config  # For AWS client configuration

# Create a variable to fix the Region to where Nova Canvas is enabled
region = "us-east-1"

# Set up an Amazon Bedrock runtime client
client = boto3.client(
    service_name="bedrock-runtime",
    region_name=region,
    config=Config(read_timeout=300),
)

# Set the content type and accept headers for the API call
accept = "application/json"
content_type = "application/json"

# Define the prompt for image generation
prompt = """A cat sitting on a chair, mountains in the background, low angle shot."""

# Create the request body with generation parameters
api_request = json.dumps({
    "taskType": "TEXT_IMAGE",  # Specify text-to-image generation
    "textToImageParams": {
        "text": prompt
    },
    "imageGenerationConfig": {
        "numberOfImages": 1,  # Generate one image
        "height": 720,        # Image height in pixels
        "width": 1280,        # Image width in pixels
        "cfgScale": 7.0,      # CFG scale
        "seed": 0             # Seed number for generation
    }
})

# Call the Bedrock model to generate the image
response = client.invoke_model(
    body=api_request,
    modelId="amazon.nova-canvas-v1:0",
    accept=accept,
    contentType=content_type,
)

# Parse the JSON response
response_json = json.loads(response.get("body").read())

# Extract the base64-encoded image from the response
base64_image = response_json.get("images")[0]

# Convert the base64 string to ASCII bytes
base64_bytes = base64_image.encode("ascii")

# Decode the base64 bytes to get the actual image bytes
image_data = base64.b64decode(base64_bytes)

# Convert bytes to an image object
output_image = Image.open(io.BytesIO(image_data))

# Display the image
output_image.show()

# Save the image to the current working directory
output_image.save("output_image.png")

Clean up

When you have finished testing this solution, clean up your resources to prevent AWS charges from being incurred:

Back up the Jupyter notebooks in the SageMaker notebook instance. Shut down and delete the SageMaker notebook instance.
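If you prefer to script the shutdown, the following sketch stops and then deletes a notebook instance using a boto3 SageMaker client that you pass in. The function name `stop_and_delete_notebook` and the instance name in the usage note are hypothetical. It assumes the instance is currently running (stopping an instance that is already stopped raises an error) and that your notebooks are already backed up.

```python
def stop_and_delete_notebook(sagemaker_client, instance_name):
    """Stop a running notebook instance, wait, then delete it and wait again."""
    sagemaker_client.stop_notebook_instance(NotebookInstanceName=instance_name)
    # An instance must be fully stopped before it can be deleted.
    sagemaker_client.get_waiter("notebook_instance_stopped").wait(
        NotebookInstanceName=instance_name
    )
    sagemaker_client.delete_notebook_instance(NotebookInstanceName=instance_name)
    sagemaker_client.get_waiter("notebook_instance_deleted").wait(
        NotebookInstanceName=instance_name
    )
```

A call might look like `stop_and_delete_notebook(boto3.client("sagemaker", region_name="us-east-1"), "my-nova-canvas-notebook")`.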

Cost considerations

Consider the following costs from the solution deployed on AWS:

Amazon Bedrock pricing

Amazon SageMaker pricing

Conclusion

This post introduced you to AI image generation and provided an overview of accessing the image models available on Amazon Bedrock. We then walked through the diffusion process and the key input parameters, with examples using Amazon Nova Canvas. The code template and examples in this post aim to familiarize you with the basics of Amazon Nova Canvas and help you get started with your AI image generation use cases on Amazon Bedrock.

For more details on text-to-image generation and other capabilities of Amazon Nova Canvas, see Generating images with Amazon Nova. Give it a try and let us know your feedback in the comments.

About the Author

Arjun Singh is a Sr. Data Scientist at Amazon, experienced in artificial intelligence, machine learning, and business intelligence. He is a visual person and deeply curious about generative AI technologies in content creation. He collaborates with customers to build ML and AI solutions to achieve their desired outcomes. He graduated with a Master’s in Information Systems from the University of Cincinnati. Outside of work, he enjoys playing tennis, working out, and learning new skills.