<section class="blog-post-content lb-rtxt"><table id="amazon-polly-audio-table"><tbody><tr><td id="amazon-polly-audio-tab"><p></p></td></tr></tbody></table><p>The newest AI models form Meta, <a href="https://aws.amazon.com/bedrock/llama/?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">Llama 4 Scout 17B and Llama 4 Maverick 17B</a>, are now available as a fully managed, serverless option in <a href="https://aws.amazon.com/bedrock/">Amazon Bedrock</a>. These new <a href="https://aws.amazon.com/what-is/foundation-models/?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">foundation models (FMs)</a> deliver natively multimodal capabilities with early fusion technology that you can use for precise image grounding and extended context processing in your applications.</p><p>Llama 4 uses an innovative mixture-of-experts (MoE) architecture that provides enhanced performance across reasoning and image understanding tasks while optimizing for both cost and speed. This architectural approach enables Llama 4 to offer improved performance at lower cost compared to Llama 3, with expanded language support for global applications.</p><p>The models were already <a href="https://aws.amazon.com/blogs/machine-learning/llama-4-family-of-models-from-meta-are-now-available-in-sagemaker-jumpstart/?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">available on Amazon SageMaker JumpStart</a>, and you can now use them in Amazon Bedrock to streamline building and scaling <a href="https://aws.amazon.com/ai/generative-ai/?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">generative AI</a> applications with <a href="https://aws.amazon.com/bedrock/security-compliance/?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">enterprise-grade security and privacy</a>.</p><p><strong>Llama 4 Maverick 17B</strong> – A natively multimodal model featuring 128 experts and 400 billion total parameters. It excels in image and text understanding, making it suitable for versatile assistant and chat applications. The model supports a 1 million token context window, giving you the flexibility to process lengthy documents and complex inputs.</p><p><strong>Llama 4 Scout 17B</strong> – A general-purpose multimodal model with 16 experts, 17 billion active parameters, and 109 billion total parameters that delivers superior performance compared to all previous Llama models. Amazon Bedrock currently supports a 3.5 million token context window for Llama 4 Scout, with plans to expand in the near future.</p><p><strong>Use cases for Llama 4 models</strong><br />You can use the advanced capabilities of Llama 4 models for a wide range of use cases across industries:</p><p><strong>Enterprise applications</strong> – Build intelligent agents that can reason across tools and workflows, process multimodal inputs, and deliver high-quality responses for business applications.</p><p><strong>Multilingual assistants</strong> – Create chat applications that understand images and provide high-quality responses across multiple languages, making them accessible to global audiences.</p><p><strong>Code and document intelligence</strong> – Develop applications that can understand code, extract structured data from documents, and provide insightful analysis across large volumes of text and code.</p><p><strong>Customer support</strong> – Enhance support systems with image analysis capabilities, enabling more effective problem resolution when customers share screenshots or photos.</p><p><strong>Content creation</strong> – Generate creative content across multiple languages, with the ability to understand and respond to visual inputs.</p><p><strong>Research</strong> – Build research applications that can integrate and analyze multimodal data, providing insights across text and images.</p><p><strong>Using Llama 4 models in Amazon Bedrock<br /></strong> To use these new serverless models in Amazon Bedrock, I first need to request access. In the <a href="https://console.aws.amazon.com/bedrock?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">Amazon Bedrock console</a>, I choose <strong>Model access</strong> from the navigation pane to toggle access to <strong>Llama 4 Maverick 17B</strong> and <strong>Llama 4 Scout 17B</strong> models.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/23/bedrock-llama4-model-access.png"><img class="aligncenter size-full wp-image-95481" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/23/bedrock-llama4-model-access.png" alt="Console screenshot." width="2126" height="726" /></a></p><p>The Llama 4 models can be easily integrated into your applications using the <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/conversation-inference.html?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">Amazon Bedrock Converse API</a>, which provides a unified interface for conversational AI interactions.</p><p>Here’s an example of how to use the <a href="https://aws.amazon.com/sdk-for-python/">AWS SDK for Python (Boto3)</a> with Llama 4 Maverick for a multimodal conversation:</p><pre class="lang-python">import boto3import jsonimport osAWS_REGION = "us-west-2"MODEL_ID = "us.meta.llama4-maverick-17b-instruct-v1:0"IMAGE_PATH = "image.jpg"def get_file_extension(filename: str) -> str: """Get the file extension.""" extension = os.path.splitext(filename)[1].lower()[1:] or 'txt' if extension == 'jpg': extension = 'jpeg' return extensiondef read_file(file_path: str) -> bytes: """Read a file in binary mode.""" try: with open(file_path, 'rb') as file: return file.read() except Exception as e: raise Exception(f"Error reading file {file_path}: {str(e)}")bedrock_runtime = boto3.client( service_name="bedrock-runtime", region_name=AWS_REGION)request_body = { "messages": [ { "role": "user", "content": [ { "text": "What can you tell me about this image?" }, { "image": { "format": get_file_extension(IMAGE_PATH), "source": {"bytes": read_file(IMAGE_PATH)}, } }, ], } ]}response = bedrock_runtime.converse( modelId=MODEL_ID, messages=request_body["messages"])print(response["output"]["message"]["content"][-1]["text"])</pre><p>This example demonstrates how to send both text and image inputs to the model and receive a conversational response. The Converse API abstracts away the complexity of working with different model input formats, providing a consistent interface across models in Amazon Bedrock.</p><p>For more interactive use cases, you can also use the streaming capabilities of the Converse API:</p><pre class="lang-python">response_stream = bedrock_runtime.converse_stream( modelId=MODEL_ID, messages=request_body['messages'])stream = response_stream.get('stream')if stream: for event in stream: if 'messageStart' in event: print(f"\nRole: {event['messageStart']['role']}") if 'contentBlockDelta' in event: print(event['contentBlockDelta']['delta']['text'], end="") if 'messageStop' in event: print(f"\nStop reason: {event['messageStop']['stopReason']}") if 'metadata' in event: metadata = event['metadata'] if 'usage' in metadata: print(f"Usage: {json.dumps(metadata['usage'], indent=4)}") if 'metrics' in metadata: print(f"Metrics: {json.dumps(metadata['metrics'], indent=4)}")</pre><p>With streaming, your applications can provide a more responsive experience by displaying model outputs as they are generated.</p><p><strong>Things to know</strong><br />The Llama 4 models are available today with a fully managed, serverless experience in <a href="https://aws.amazon.com/bedrock/">Amazon Bedrock</a> in the US East (N. Virginia) and US West (Oregon) <a href="https://aws.amazon.com/about-aws/global-infrastructure/regions_az/?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">AWS Regions.</a> You can also access Llama 4 in US East (Ohio) via <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference.html?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">cross-region inference</a>.</p><p>As usual with Amazon Bedrock, you pay for what you use. For more information, see <a href="https://aws.amazon.com/bedrock/pricing/?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">Amazon Bedrock pricing</a>.</p><p>These models support 12 languages for text (English, French, German, Hindi, Italian, Portuguese, Spanish, Thai, Arabic, Indonesian, Tagalog, and Vietnamese) and English when processing images.</p><p>To start using these new models today, visit the <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters-meta.html?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">Meta Llama models section in the Amazon Bedrock User Guide</a>. You can also explore how our Builder communities are using Amazon Bedrock in their solutions in the generative AI section of our <a href="https://community.aws/?trk=e61dee65-4ce8-4738-84db-75305c9cd4fe&amp;sc_channel=el">community.aws</a> site.</p><p>— <a href="https://x.com/danilop">Danilo</a></p><hr /><p>How is the News Blog doing? Take this <a href="https://amazonmr.au1.qualtrics.com/jfe/form/SV_eyD5tC5xNGCdCmi">1 minute survey</a>!</p><p><em>(This <a href="https://amazonmr.au1.qualtrics.com/jfe/form/SV_eyD5tC5xNGCdCmi">survey</a> is hosted by an external company. AWS handles your information as described in the <a href="https://aws.amazon.com/privacy/?trk=4b29643c-e00f-4ab6-ab9c-b1fb47aa1708&amp;sc_channel=blog">AWS Privacy Notice</a>. AWS will own the data gathered via this survey and will not share the information collected with survey respondents.)</em></p></section><aside id="Comments" class="blog-comments"><div data-lb-comp="aws-blog:cosmic-comments" data-env="prod" data-content-id="f0397050-d089-43c0-8851-d2de97f4168c" data-title="Llama 4 models from Meta now available in Amazon Bedrock serverless" data-url="https://aws.amazon.com/blogs/aws/llama-4-models-from-meta-now-available-in-amazon-bedrock-serverless/"><p data-failed-message="Comments cannot be loaded… Please refresh and try again.">Loading comments…</p></div></aside>