AWS Machine Learning Blog, May 29, 02:30
Gemma 3 27B model now available on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart

This post describes how to deploy and use Google's Gemma 3 27B Instruct model on Amazon Bedrock Marketplace and SageMaker JumpStart. Gemma 3 27B is a high-performance, open, multimodal language model that can process text, images, and short video, with long-context and multilingual support. Through Amazon Bedrock Marketplace, users can quickly access and deploy the model and take advantage of its strong instruction-following capabilities. SageMaker JumpStart offers more flexible deployment and customization options: users can deploy through the UI or the SageMaker Python SDK and use the various capabilities of SageMaker AI for fine-tuning and optimization.

🔑 Gemma 3 27B is a high-performance, open, multimodal language model developed by Google, with 27 billion parameters, support for text and image input, and efficient context understanding, making it well suited for complex reasoning tasks, long-form dialogue, and vision-language applications.

🌐 Gemma 3 27B has strong multilingual support, handling more than 35 languages out of the box and exposed to over 140 languages during pre-training, which makes it well suited for building multilingual AI assistants and tools.

🚀 Users can deploy the Gemma 3 27B Instruct model on AWS through Amazon Bedrock Marketplace or SageMaker JumpStart. Amazon Bedrock Marketplace offers simplified access, while SageMaker JumpStart provides more flexible customization and deployment options.

🔧 Deploying the Gemma 3 27B Instruct model on Amazon Bedrock Marketplace requires an AWS account and access to GPU-accelerated instances; during deployment, security features such as VPC networking, role-based permissions, and data encryption can be customized.

💻 Deployment through SageMaker JumpStart can be done with the user-friendly SageMaker JumpStart UI or programmatically with the SageMaker Python SDK, and the various capabilities of SageMaker AI can be used for fine-tuning and optimization.

We are excited to announce the availability of Gemma 3 27B Instruct models through Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. With this launch, developers and data scientists can now deploy Gemma 3, a 27-billion-parameter language model, along with its specialized instruction-following versions, to help accelerate building, experimentation, and scalable deployment of generative AI solutions on AWS.

In this post, we show you how to get started with Gemma 3 27B Instruct on both Amazon Bedrock Marketplace and SageMaker JumpStart, and how to use the model’s powerful instruction-following capabilities in your applications.

Overview of Gemma 3 27B

Gemma 3 27B is a high-performance, open-weight, multimodal language model by Google designed to handle both text and image inputs with efficiency and contextual understanding. It introduces a redesigned attention architecture, enhanced multilingual support, and extended context capabilities. With its optimized memory usage and support for large input sequences, it is well suited for complex reasoning tasks, long-form interactions, and vision-language applications. With 27 billion parameters, these models are optimized for tasks requiring advanced reasoning, multilingual capabilities, and instruction following. According to Google, Gemma 3 27B Instruct models are ideal for developers, researchers, and businesses looking to build generative AI applications such as chatbots, virtual assistants, and automated content generation tools. The following are its key features:

    Multimodal input, accepting text and images in a single prompt
    Long-context understanding for extended documents and conversations
    Out-of-the-box support for over 35 languages, with pre-training exposure to more than 140 languages
    Strong instruction following for chat-style and task-oriented prompts

Key use cases for Gemma 3, as described by Google, include:

    Chatbots and virtual assistants
    Multilingual AI assistants and tools
    Question answering, summarization, and reasoning over text
    Vision-language applications such as image understanding

There are two primary methods for deploying Gemma 3 27B on AWS. The first is Amazon Bedrock Marketplace, which offers streamlined access to the Amazon Bedrock APIs (Invoke and Converse) and tools such as Amazon Bedrock Knowledge Bases, Amazon Bedrock Agents, Amazon Bedrock Flows, Amazon Bedrock Guardrails, and model evaluation. The second is SageMaker JumpStart, a machine learning (ML) hub offering foundation models (FMs), built-in algorithms, and prebuilt ML solutions; you can deploy pre-trained models using either the Amazon SageMaker console or the SDK.

Deploy Gemma 3 27B Instruct on Amazon Bedrock Marketplace

Amazon Bedrock Marketplace offers access to over 150 specialized FMs, including Gemma 3 27B Instruct.

Prerequisites

To try the Gemma 3 27B Instruct model using Amazon Bedrock Marketplace, you need the following:

    An AWS account
    Access to GPU-accelerated instances (such as ml.g5.48xlarge) for hosting the model

Deploy the model

To deploy the model using Amazon Bedrock Marketplace, complete the following steps:

    On the Amazon Bedrock console, under Foundation models in the navigation pane, select Model catalog. Filter for Gemma as the provider and choose Gemma 3 27B Instruct.

Information about Gemma 3’s features, costs, and setup instructions can be found on its model overview page. This resource includes integration examples, API documentation, and programming samples. The model excels at a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. You can also access deployment guidelines and license details to begin implementing Gemma 3 in your projects.

    Review the model details, pricing, and deployment guidelines, and choose Deploy to start the deployment process.

    For Endpoint name, enter an endpoint name (between 1 and 50 alphanumeric characters) or keep the pre-populated default. For Number of instances, enter a number of instances (between 1 and 100). For Instance type, select your preferred instance type; GPU-powered options such as ml.g5.48xlarge are particularly well suited for running Gemma 3 efficiently.

Although default configurations are typically sufficient for basic needs, you have the option to customize security features such as virtual private cloud (VPC) networking, role-based permissions, and data encryption. These advanced settings might require adjustment for production environments to maintain compliance with your organization’s security protocols.

Prior to deploying Gemma 3, verify that your AWS account has sufficient quota allocation for ml.g5.48xlarge instances. A quota set to 0 will trigger deployment failures, as shown in the following screenshot.

To request a quota increase, open the AWS Service Quotas console and search for SageMaker. Locate ml.g5.48xlarge for endpoint usage and choose Request quota increase, then specify your required limit value.
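The console steps above can also be scripted with the boto3 Service Quotas API. The following is a minimal sketch, not the blog's own code; the quota name string and desired value are assumptions you should match against what the Service Quotas console shows for your account, and the client is passed in explicitly:

```python
def find_quota_code(sq_client, quota_name):
    """Scan the SageMaker service quotas and return the QuotaCode whose
    QuotaName matches, or None if no such quota is found."""
    paginator = sq_client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode="sagemaker"):
        for quota in page["Quotas"]:
            if quota["QuotaName"] == quota_name:
                return quota["QuotaCode"]
    return None


def request_g5_endpoint_quota(sq_client, desired_value):
    """Request an increase for the ml.g5.48xlarge endpoint-usage quota.
    The quota name below is an assumption; verify it in the console."""
    code = find_quota_code(sq_client, "ml.g5.48xlarge for endpoint usage")
    if code is None:
        raise ValueError("Quota not found; check the quota name in the console")
    return sq_client.request_service_quota_increase(
        ServiceCode="sagemaker", QuotaCode=code, DesiredValue=float(desired_value)
    )


# Usage (requires AWS credentials):
# import boto3
# request_g5_endpoint_quota(boto3.client("service-quotas"), 1)
```

As with the console flow, a scripted request is still reviewed by AWS and is not granted instantaneously.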

    While the deployment is in progress, you can choose Managed deployments in the navigation pane to monitor the deployment status. When deployment is complete, you can test Gemma 3’s capabilities directly in the Amazon Bedrock playground by selecting the managed deployment and choosing Open in playground.

You can now use the playground to interact with Gemma 3.

For detailed steps and example code for invoking the model using Amazon Bedrock APIs, refer to Submit prompts and generate response using the API and the following code:

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

endpoint_arn = "arn:aws:sagemaker:us-east-2:061519324070:endpoint/endpoint-quick-start-3t7kp"

response = bedrock_runtime.converse(
    modelId=endpoint_arn,
    messages=[
        {
            "role": "user",
            "content": [{"text": "What is Amazon doing in the field of generative AI?"}]
        }
    ],
    inferenceConfig={
        "maxTokens": 256,
        "temperature": 0.1,
        "topP": 0.999
    }
)

print(response["output"]["message"]["content"][0]["text"])
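Because Gemma 3 is multimodal, the same Converse API call can carry an image alongside a text prompt. The helper below is a sketch of how such a message can be assembled; the file name and prompt are placeholders, not values from the post:

```python
def build_image_message(image_bytes, image_format, prompt):
    """Assemble a Converse API user message pairing an image with a text prompt.
    image_format is the lowercase file format, for example "png" or "jpeg"."""
    return {
        "role": "user",
        "content": [
            {"image": {"format": image_format, "source": {"bytes": image_bytes}}},
            {"text": prompt},
        ],
    }


# Usage (placeholder file name; requires a deployed endpoint and credentials):
# with open("chart.png", "rb") as f:
#     msg = build_image_message(f.read(), "png", "Describe this chart.")
# response = bedrock_runtime.converse(modelId=endpoint_arn, messages=[msg])
```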

Deploy Gemma 3 27B Instruct with SageMaker JumpStart

SageMaker JumpStart offers access to a broad selection of publicly available FMs. These pre-trained models serve as powerful starting points that can be deeply customized to address specific use cases. You can use state-of-the-art model architectures—such as language models, computer vision models, and more—without having to build them from scratch.

With SageMaker JumpStart, you can deploy models in a secure environment. The models can be provisioned on dedicated SageMaker inference instances and can be isolated within your VPC. After deploying an FM, you can further customize and fine-tune it using the extensive capabilities of Amazon SageMaker AI, including SageMaker inference for deploying models and container logs for improved observability. With SageMaker AI, you can streamline the entire model deployment process.

There are two ways to deploy the Gemma 3 model using SageMaker JumpStart:

    Through the SageMaker JumpStart UI
    Programmatically, using the SageMaker Python SDK

We examine both deployment methods to help you determine which approach aligns best with your requirements.

Prerequisites

To try the Gemma 3 27B Instruct model in SageMaker JumpStart, you need the following prerequisites:

    An AWS account
    An AWS Identity and Access Management (IAM) execution role with permissions to deploy SageMaker resources
    Access to SageMaker Studio (first-time users are prompted to create a domain)

Deploy the model through the SageMaker JumpStart UI

SageMaker JumpStart provides a user-friendly interface for deploying pre-built ML models with just a few clicks. Through the SageMaker JumpStart UI, you can select, customize, and deploy a wide range of models for various tasks such as image classification, object detection, and natural language processing, without the need for extensive coding or ML expertise.

    On the SageMaker AI console, choose Studio in the navigation pane. First-time users will be prompted to create a domain.
    On the SageMaker Studio console, choose JumpStart in the navigation pane.

The model browser displays available models, with details like the provider name and model capabilities.

    Search for Gemma 3 to view the Gemma 3 model card. Each model card shows key information, including:
      Model name
      Provider name
      Task category (for example, Text Generation)
      The Bedrock Ready badge (if applicable), indicating that this model can be registered with Amazon Bedrock, so you can use Amazon Bedrock APIs to invoke the model

    Choose the model card to view the model details page.

The model details page includes the following information:

Before you deploy the model, we recommend you review the model details and license terms to confirm compatibility with your use case.

    Choose Deploy to proceed with deployment.
    For Endpoint name, enter an endpoint name (between 1 and 50 alphanumeric characters) or leave it as the default.
    For Instance type, choose an instance type (default: ml.g5.48xlarge).
    For Initial instance count, enter the number of instances (default: 1).

Selecting appropriate instance types and counts is crucial for cost and performance optimization. Monitor your deployment to adjust these settings as needed. Under Inference type, Real-time inference is selected by default. This is optimized for sustained traffic and low latency.
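To monitor traffic against those instance settings, you can query the endpoint's invocation count in CloudWatch. The sketch below only builds the query arguments, so it can be checked without credentials; the endpoint name is a placeholder, and the `AllTraffic` variant name is the SageMaker default:

```python
import datetime


def build_invocations_query(endpoint_name, variant_name="AllTraffic", minutes=60):
    """Build keyword arguments for CloudWatch get_metric_statistics that sum
    endpoint invocations over the last `minutes`, in 5-minute buckets."""
    now = datetime.datetime.now(datetime.timezone.utc)
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": "Invocations",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": now - datetime.timedelta(minutes=minutes),
        "EndTime": now,
        "Period": 300,  # 5-minute buckets
        "Statistics": ["Sum"],
    }


# Usage (requires AWS credentials):
# import boto3
# cw = boto3.client("cloudwatch")
# stats = cw.get_metric_statistics(**build_invocations_query("my-gemma3-endpoint"))
```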

    Review all configurations for accuracy. For this model, we strongly recommend adhering to SageMaker JumpStart default settings and making sure that network isolation remains in place. Choose Deploy to deploy the model.

The deployment process can take several minutes to complete.
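Rather than refreshing the console during those minutes, you can block until the endpoint is in service with the built-in boto3 waiter. This is a hedged sketch with the client injected for testability; the endpoint name in the usage comment is a placeholder:

```python
def wait_until_in_service(sm_client, endpoint_name, delay=30, max_attempts=60):
    """Poll every `delay` seconds (up to `max_attempts` times) until the
    endpoint reaches InService, then return its final status string."""
    waiter = sm_client.get_waiter("endpoint_in_service")
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )
    return sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]


# Usage (requires AWS credentials):
# import boto3
# sm = boto3.client("sagemaker")
# print(wait_until_in_service(sm, "my-gemma3-endpoint"))
```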

Deploy the model programmatically using the SageMaker Python SDK

To use Gemma 3 with the SageMaker Python SDK, first make sure you have installed the SDK and set up your AWS permissions and environment correctly. The following is a code example showing how to programmatically deploy and run inference with Gemma 3:

import sagemaker
from sagemaker.jumpstart.model import JumpStartModel

# Initialize SageMaker session
session = sagemaker.Session()
role = sagemaker.get_execution_role()

# Specify model parameters
model_id = "huggingface-vlm-gemma-3-27b-instruct"  # or "huggingface-llm-gemma-2b" for the smaller version
instance_type = "ml.g5.48xlarge"  # Choose an appropriate instance based on your needs

# Create the model
model = JumpStartModel(
    model_id=model_id,
    role=role,
    instance_type=instance_type,
    model_version="*",  # Latest version
)

# Deploy the model
predictor = model.deploy(
    initial_instance_count=1,
    accept_eula=True  # Required for deploying foundation models
)

Run inference using the SageMaker API

With your Gemma 3 model successfully deployed as a SageMaker endpoint, you’re now ready to start making predictions. The SageMaker SDK provides a straightforward way to interact with your model endpoint for inference tasks. The following code demonstrates how to format your input and make API calls to the endpoint. The code handles both sending requests to the model and processing its responses, making it straightforward to integrate Gemma 3 into your applications.

import json
import boto3

# Initialize AWS session (ensure your AWS credentials are configured)
session = boto3.Session()
sagemaker_runtime = session.client("sagemaker-runtime")

# Define the SageMaker endpoint name (replace with your deployed endpoint name)
endpoint_name = "hf-vlm-gemma-3-27b-instruct-2025-05-07-18-09-16-221"

payload = {
    "inputs": "What is Amazon doing in the field of generative AI?",
    "parameters": {
        "max_new_tokens": 256,
        "temperature": 0.1,
        "top_p": 0.9,
        "return_full_text": False
    }
}

# Run inference
try:
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload)
    )
    # Parse the response
    result = json.loads(response["Body"].read().decode("utf-8"))
    generated_text = result[0]["generated_text"].strip()
    print("Generated Response:")
    print(generated_text)
except Exception as e:
    print(f"Error during inference: {e}")

Clean up

To avoid incurring ongoing charges for the AWS resources used while exploring Gemma 3 27B Instruct models, it’s important to clean up deployed endpoints and associated resources. Complete the following steps:

    Delete SageMaker endpoints:
      On the SageMaker console, in the navigation pane, choose Endpoints under Inference.
      Select the endpoint associated with the Gemma 3 27B Instruct model (for example, gemma3-27b-instruct-endpoint).
      Choose Delete and confirm the deletion. This stops the endpoint and prevents further compute charges.
    Delete SageMaker models (if applicable):
      On the SageMaker console, choose Models under Inference.
      Select the model associated with your endpoint and choose Delete.
    Verify Amazon Bedrock Marketplace resources:
      On the Amazon Bedrock console, choose Model catalog in the navigation pane.
      Make sure no additional endpoints are running for the Gemma 3 27B Instruct model deployed through Amazon Bedrock Marketplace.

Always verify that all endpoints are deleted after experimentation to optimize costs. Refer to the Amazon SageMaker documentation for additional guidance on managing resources.
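The console cleanup above can also be scripted with the boto3 SageMaker client. This is a sketch, not the post's own code; the client is passed in for testability, and the endpoint name in the usage comment is a placeholder:

```python
def cleanup_endpoint(sm_client, endpoint_name):
    """Delete an endpoint plus its endpoint config and backing model(s),
    so no compute or storage charges linger. Returns the config name."""
    desc = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = desc["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    sm_client.delete_endpoint(EndpointName=endpoint_name)  # stops compute charges
    for variant in config["ProductionVariants"]:
        sm_client.delete_model(ModelName=variant["ModelName"])
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name


# Usage (requires AWS credentials):
# import boto3
# cleanup_endpoint(boto3.client("sagemaker"), "gemma3-27b-instruct-endpoint")
```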

Conclusion

The availability of Gemma 3 27B Instruct models in Amazon Bedrock Marketplace and SageMaker JumpStart empowers developers, researchers, and businesses to build cutting-edge generative AI applications with ease. With their high performance, multilingual capabilities, and efficient deployment on AWS infrastructure, these models are well suited for a wide range of use cases, from conversational AI to code generation and content automation. By using the seamless discovery and deployment capabilities of SageMaker JumpStart and Amazon Bedrock Marketplace, you can accelerate your AI innovation while benefiting from the secure, scalable, and cost-effective AWS Cloud infrastructure.

We encourage you to explore the Gemma 3 27B Instruct models today by visiting the SageMaker JumpStart console or Amazon Bedrock Marketplace. Deploy the model and experiment with sample prompts to meet your specific needs. For further learning, explore the AWS Machine Learning Blog, the SageMaker JumpStart GitHub repository, and the Amazon Bedrock documentation. Start building your next generative AI solution with Gemma 3 27B Instruct models and unlock new possibilities with AWS!


About the Authors

Santosh Vallurupalli is a Sr. Solutions Architect at AWS. Santosh specializes in networking, containers, and migrations, and enjoys helping customers in their journey of cloud adoption and building cloud-based solutions for challenging issues. In his spare time, he likes traveling, watching Formula1, and watching The Office on repeat.

Aravind Singirikonda is an AI/ML Solutions Architect at AWS. He works with AWS customers in the healthcare and life sciences domain to provide guidance and technical assistance, helping them improve the value of their AI/ML solutions when using AWS.

Pawan Matta is a Sr. Solutions Architect at AWS. He works with AWS customers in the gaming industry and guides them to deploy highly scalable, performant architectures. His area of focus is management and governance. In his free time, he likes to play FIFA and watch cricket.

Ajit Mahareddy is an experienced Product and Go-To-Market (GTM) leader with over 20 years of experience in product management, engineering, and GTM. Prior to his current role, Ajit led product management building AI/ML products at leading technology companies, including Uber, Turing, and eHealth. He is passionate about advancing generative AI technologies and driving real-world impact with generative AI.
