AWS Machine Learning Blog, May 29, 02:30
Gemma 3 27B model now available on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart

This post describes how to deploy and use Google's Gemma 3 27B Instruct model on Amazon Bedrock Marketplace and SageMaker JumpStart. Gemma 3 27B is a high-performance, open, multimodal language model that can process text, images, and short video, with long-context and multilingual support. Through Amazon Bedrock Marketplace, users can quickly access and deploy the model and take advantage of its strong instruction-following capabilities. SageMaker JumpStart offers more flexible deployment and customization options: users can deploy through the UI or the SageMaker Python SDK and use the various capabilities of SageMaker AI for fine-tuning and optimization.

🔑 Gemma 3 27B is a high-performance, open, multimodal language model developed by Google, with 27 billion parameters, support for text and image input, and efficient context understanding, making it well suited for complex reasoning tasks, long-form dialogue, and vision-language applications.

🌐 Gemma 3 27B has strong multilingual support, handling more than 35 languages out of the box and exposed to over 140 languages during pre-training, which makes it well suited for building multilingual AI assistants and tools.

🚀 Users can deploy the Gemma 3 27B Instruct model on AWS through Amazon Bedrock Marketplace or SageMaker JumpStart. Amazon Bedrock Marketplace offers simplified access, while SageMaker JumpStart provides more flexible customization and deployment options.

🔧 Deploying the Gemma 3 27B Instruct model on Amazon Bedrock Marketplace requires an AWS account and access to GPU-accelerated instances; during deployment, security features such as VPC networking, role-based permissions, and data encryption can be customized.

💻 Deployment through SageMaker JumpStart can be done with the user-friendly SageMaker JumpStart UI or programmatically with the SageMaker Python SDK, and the various capabilities of SageMaker AI can be used for fine-tuning and optimization.

We are excited to announce the availability of Gemma 3 27B Instruct models through Amazon Bedrock Marketplace and Amazon SageMaker JumpStart. With this launch, developers and data scientists can now deploy Gemma 3, a 27-billion-parameter language model, along with its specialized instruction-following versions, to help accelerate building, experimentation, and scalable deployment of generative AI solutions on AWS.

In this post, we show you how to get started with Gemma 3 27B Instruct on both Amazon Bedrock Marketplace and SageMaker JumpStart, and how to use the model’s powerful instruction-following capabilities in your applications.

Overview of Gemma 3 27B

Gemma 3 27B is a high-performance, open-weight, multimodal language model by Google designed to handle both text and image inputs with efficiency and contextual understanding. It introduces a redesigned attention architecture, enhanced multilingual support, and extended context capabilities. With its optimized memory usage and support for large input sequences, it is well suited for complex reasoning tasks, long-form interactions, and vision-language applications. With 27 billion parameters, these models are optimized for tasks requiring advanced reasoning, multilingual capabilities, and instruction following. According to Google, Gemma 3 27B Instruct models are ideal for developers, researchers, and businesses looking to build generative AI applications such as chatbots, virtual assistants, and automated content generation tools. The following are its key features:

    Multimodal input, accepting text and images in a single prompt
    Long-context understanding for extended documents and conversations
    Out-of-the-box support for over 35 languages, with pre-training exposure to more than 140 languages
    Strong instruction following for chat-style and task-oriented prompts

Key use cases for Gemma 3, as described by Google, include:

    Chatbots and virtual assistants
    Multilingual AI assistants and tools
    Question answering, summarization, and reasoning over text
    Vision-language applications such as image understanding

There are two primary methods for deploying Gemma 3 27B on AWS. The first is Amazon Bedrock Marketplace, which offers streamlined access to the Amazon Bedrock APIs (Invoke and Converse) and tools such as Amazon Bedrock Knowledge Bases, Amazon Bedrock Agents, Amazon Bedrock Flows, Amazon Bedrock Guardrails, and model evaluation. The second is SageMaker JumpStart, a machine learning (ML) hub offering foundation models (FMs), built-in algorithms, and prebuilt ML solutions; you can deploy pre-trained models using either the Amazon SageMaker console or the SDK.

Deploy Gemma 3 27B Instruct on Amazon Bedrock Marketplace

Amazon Bedrock Marketplace offers access to over 150 specialized FMs, including Gemma 3 27B Instruct.

Prerequisites

To try the Gemma 3 27B Instruct model using Amazon Bedrock Marketplace, you need the following:

    An AWS account
    Access to GPU-accelerated instances (such as ml.g5.48xlarge) for hosting the model

Deploy the model

To deploy the model using Amazon Bedrock Marketplace, complete the following steps:

    On the Amazon Bedrock console, under Foundation models in the navigation pane, select Model catalog. Filter for Gemma as the provider and choose Gemma 3 27B Instruct.

Information about Gemma 3’s features, costs, and setup instructions can be found on its model overview page. This resource includes integration examples, API documentation, and programming samples. The model excels at a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. You can also access deployment guidelines and license details to begin implementing Gemma 3 in your projects.

    Review the model details, pricing, and deployment guidelines, and choose Deploy to start the deployment process.

    For Endpoint name, enter an endpoint name (between 1 and 50 alphanumeric characters) or keep the pre-populated default. For Number of instances, enter a number of instances (between 1 and 100). For Instance type, select your preferred instance type; GPU-powered options such as ml.g5.48xlarge are particularly well suited for running Gemma 3 efficiently.

Although default configurations are typically sufficient for basic needs, you have the option to customize security features such as virtual private cloud (VPC) networking, role-based permissions, and data encryption. These advanced settings might require adjustment for production environments to maintain compliance with your organization’s security protocols.

Prior to deploying Gemma 3, verify that your AWS account has sufficient quota allocation for ml.g5.48xlarge instances. A quota set to 0 will trigger deployment failures, as shown in the following screenshot.

To request a quota increase, open the AWS Service Quotas console and search for SageMaker. Locate ml.g5.48xlarge for endpoint usage and choose Request quota increase, then specify your required limit value.
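The console steps above can also be scripted with the boto3 Service Quotas API. The following is a minimal sketch, not the blog's own code; the quota name string and desired value are assumptions you should match against what the Service Quotas console shows for your account, and the client is passed in explicitly:

```python
def find_quota_code(sq_client, quota_name):
    """Scan the SageMaker service quotas and return the QuotaCode whose
    QuotaName matches, or None if no such quota is found."""
    paginator = sq_client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode="sagemaker"):
        for quota in page["Quotas"]:
            if quota["QuotaName"] == quota_name:
                return quota["QuotaCode"]
    return None


def request_g5_endpoint_quota(sq_client, desired_value):
    """Request an increase for the ml.g5.48xlarge endpoint-usage quota.
    The quota name below is an assumption; verify it in the console."""
    code = find_quota_code(sq_client, "ml.g5.48xlarge for endpoint usage")
    if code is None:
        raise ValueError("Quota not found; check the quota name in the console")
    return sq_client.request_service_quota_increase(
        ServiceCode="sagemaker", QuotaCode=code, DesiredValue=float(desired_value)
    )


# Usage (requires AWS credentials):
# import boto3
# request_g5_endpoint_quota(boto3.client("service-quotas"), 1)
```

As with the console flow, a scripted request is still reviewed by AWS and is not granted instantaneously.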

    While the deployment is in progress, you can choose Managed deployments in the navigation pane to monitor the deployment status. When deployment is complete, you can test Gemma 3’s capabilities directly in the Amazon Bedrock playground by selecting the managed deployment and choosing Open in playground.

You can now use the playground to interact with Gemma 3.

For detailed steps and example code for invoking the model using Amazon Bedrock APIs, refer to Submit prompts and generate response using the API and the following code:

import boto3

bedrock_runtime = boto3.client("bedrock-runtime")

endpoint_arn = "arn:aws:sagemaker:us-east-2:061519324070:endpoint/endpoint-quick-start-3t7kp"

response = bedrock_runtime.converse(
    modelId=endpoint_arn,
    messages=[
        {
            "role": "user",
            "content": [{"text": "What is Amazon doing in the field of generative AI?"}]
        }
    ],
    inferenceConfig={
        "maxTokens": 256,
        "temperature": 0.1,
        "topP": 0.999
    }
)

print(response["output"]["message"]["content"][0]["text"])
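Because Gemma 3 is multimodal, the same Converse API call can carry an image alongside a text prompt. The helper below is a sketch of how such a message can be assembled; the file name and prompt are placeholders, not values from the post:

```python
def build_image_message(image_bytes, image_format, prompt):
    """Assemble a Converse API user message pairing an image with a text prompt.
    image_format is the lowercase file format, for example "png" or "jpeg"."""
    return {
        "role": "user",
        "content": [
            {"image": {"format": image_format, "source": {"bytes": image_bytes}}},
            {"text": prompt},
        ],
    }


# Usage (placeholder file name; requires a deployed endpoint and credentials):
# with open("chart.png", "rb") as f:
#     msg = build_image_message(f.read(), "png", "Describe this chart.")
# response = bedrock_runtime.converse(modelId=endpoint_arn, messages=[msg])
```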

Deploy Gemma 3 27B Instruct with SageMaker JumpStart

SageMaker JumpStart offers access to a broad selection of publicly available FMs. These pre-trained models serve as powerful starting points that can be deeply customized to address specific use cases. You can use state-of-the-art model architectures—such as language models, computer vision models, and more—without having to build them from scratch.

With SageMaker JumpStart, you can deploy models in a secure environment. The models can be provisioned on dedicated SageMaker inference instances and can be isolated within your VPC. After deploying an FM, you can further customize and fine-tune it using the extensive capabilities of Amazon SageMaker AI, including SageMaker inference for deploying models and container logs for improved observability. With SageMaker AI, you can streamline the entire model deployment process.

There are two ways to deploy the Gemma 3 model using SageMaker JumpStart:

    Through the SageMaker JumpStart UI
    Programmatically, using the SageMaker Python SDK

We examine both deployment methods to help you determine which approach aligns best with your requirements.

Prerequisites

To try the Gemma 3 27B Instruct model in SageMaker JumpStart, you need the following prerequisites:

    An AWS account
    An AWS Identity and Access Management (IAM) execution role with permissions to deploy SageMaker resources
    Access to SageMaker Studio (first-time users are prompted to create a domain)

Deploy the model through the SageMaker JumpStart UI

SageMaker JumpStart provides a user-friendly interface for deploying pre-built ML models with just a few clicks. Through the SageMaker JumpStart UI, you can select, customize, and deploy a wide range of models for various tasks such as image classification, object detection, and natural language processing, without the need for extensive coding or ML expertise.

    On the SageMaker AI console, choose Studio in the navigation pane. First-time users will be prompted to create a domain.
    On the SageMaker Studio console, choose JumpStart in the navigation pane.

The model browser displays available models, with details like the provider name and model capabilities.

    Search for Gemma 3 to view the Gemma 3 model card. Each model card shows key information, including:
      Model name
      Provider name
      Task category (for example, Text Generation)
      The Bedrock Ready badge (if applicable), indicating that this model can be registered with Amazon Bedrock, so you can use Amazon Bedrock APIs to invoke the model

    Choose the model card to view the model details page.

The model details page includes the following information:

Before you deploy the model, we recommend you review the model details and license terms to confirm compatibility with your use case.

    Choose Deploy to proceed with deployment.
    For Endpoint name, enter an endpoint name (between 1 and 50 alphanumeric characters) or leave it as the default.
    For Instance type, choose an instance type (default: ml.g5.48xlarge).
    For Initial instance count, enter the number of instances (default: 1).

Selecting appropriate instance types and counts is crucial for cost and performance optimization. Monitor your deployment to adjust these settings as needed. Under Inference type, Real-time inference is selected by default. This is optimized for sustained traffic and low latency.
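To monitor traffic against those instance settings, you can query the endpoint's invocation count in CloudWatch. The sketch below only builds the query arguments, so it can be checked without credentials; the endpoint name is a placeholder, and the `AllTraffic` variant name is the SageMaker default:

```python
import datetime


def build_invocations_query(endpoint_name, variant_name="AllTraffic", minutes=60):
    """Build keyword arguments for CloudWatch get_metric_statistics that sum
    endpoint invocations over the last `minutes`, in 5-minute buckets."""
    now = datetime.datetime.now(datetime.timezone.utc)
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": "Invocations",
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": now - datetime.timedelta(minutes=minutes),
        "EndTime": now,
        "Period": 300,  # 5-minute buckets
        "Statistics": ["Sum"],
    }


# Usage (requires AWS credentials):
# import boto3
# cw = boto3.client("cloudwatch")
# stats = cw.get_metric_statistics(**build_invocations_query("my-gemma3-endpoint"))
```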

    Review all configurations for accuracy. For this model, we strongly recommend adhering to SageMaker JumpStart default settings and making sure that network isolation remains in place. Choose Deploy to deploy the model.

The deployment process can take several minutes to complete.
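Rather than refreshing the console during those minutes, you can block until the endpoint is in service with the built-in boto3 waiter. This is a hedged sketch with the client injected for testability; the endpoint name in the usage comment is a placeholder:

```python
def wait_until_in_service(sm_client, endpoint_name, delay=30, max_attempts=60):
    """Poll every `delay` seconds (up to `max_attempts` times) until the
    endpoint reaches InService, then return its final status string."""
    waiter = sm_client.get_waiter("endpoint_in_service")
    waiter.wait(
        EndpointName=endpoint_name,
        WaiterConfig={"Delay": delay, "MaxAttempts": max_attempts},
    )
    return sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]


# Usage (requires AWS credentials):
# import boto3
# sm = boto3.client("sagemaker")
# print(wait_until_in_service(sm, "my-gemma3-endpoint"))
```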

Deploy the model programmatically using the SageMaker Python SDK

To use Gemma 3 with the SageMaker Python SDK, first make sure you have installed the SDK and set up your AWS permissions and environment correctly. The following is a code example showing how to programmatically deploy and run inference with Gemma 3:

import sagemaker
from sagemaker.jumpstart.model import JumpStartModel

# Initialize SageMaker session
session = sagemaker.Session()
role = sagemaker.get_execution_role()

# Specify model parameters
model_id = "huggingface-vlm-gemma-3-27b-instruct"  # or "huggingface-llm-gemma-2b" for the smaller version
instance_type = "ml.g5.48xlarge"  # Choose an appropriate instance based on your needs

# Create the model
model = JumpStartModel(
    model_id=model_id,
    role=role,
    instance_type=instance_type,
    model_version="*",  # Latest version
)

# Deploy the model
predictor = model.deploy(
    initial_instance_count=1,
    accept_eula=True  # Required for deploying foundation models
)

Run inference using the SageMaker API

With your Gemma 3 model successfully deployed as a SageMaker endpoint, you’re now ready to start making predictions. The SageMaker SDK provides a straightforward way to interact with your model endpoint for inference tasks. The following code demonstrates how to format your input and make API calls to the endpoint. The code handles both sending requests to the model and processing its responses, making it straightforward to integrate Gemma 3 into your applications.

import json
import boto3

# Initialize AWS session (ensure your AWS credentials are configured)
session = boto3.Session()
sagemaker_runtime = session.client("sagemaker-runtime")

# Define the SageMaker endpoint name (replace with your deployed endpoint name)
endpoint_name = "hf-vlm-gemma-3-27b-instruct-2025-05-07-18-09-16-221"

payload = {
    "inputs": "What is Amazon doing in the field of generative AI?",
    "parameters": {
        "max_new_tokens": 256,
        "temperature": 0.1,
        "top_p": 0.9,
        "return_full_text": False
    }
}

# Run inference
try:
    response = sagemaker_runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload)
    )
    # Parse the response
    result = json.loads(response["Body"].read().decode("utf-8"))
    generated_text = result[0]["generated_text"].strip()
    print("Generated Response:")
    print(generated_text)
except Exception as e:
    print(f"Error during inference: {e}")

Clean up

To avoid incurring ongoing charges for the AWS resources used while exploring Gemma 3 27B Instruct models, it’s important to clean up deployed endpoints and associated resources. Complete the following steps:

    Delete SageMaker endpoints:
      On the SageMaker console, in the navigation pane, choose Endpoints under Inference.
      Select the endpoint associated with the Gemma 3 27B Instruct model (for example, gemma3-27b-instruct-endpoint).
      Choose Delete and confirm the deletion. This stops the endpoint and prevents further compute charges.
    Delete SageMaker models (if applicable):
      On the SageMaker console, choose Models under Inference.
      Select the model associated with your endpoint and choose Delete.
    Verify Amazon Bedrock Marketplace resources:
      On the Amazon Bedrock console, choose Model catalog in the navigation pane.
      Make sure no additional endpoints are running for the Gemma 3 27B Instruct model deployed through Amazon Bedrock Marketplace.

Always verify that all endpoints are deleted after experimentation to optimize costs. Refer to the Amazon SageMaker documentation for additional guidance on managing resources.
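The console cleanup above can also be scripted with the boto3 SageMaker client. This is a sketch, not the post's own code; the client is passed in for testability, and the endpoint name in the usage comment is a placeholder:

```python
def cleanup_endpoint(sm_client, endpoint_name):
    """Delete an endpoint plus its endpoint config and backing model(s),
    so no compute or storage charges linger. Returns the config name."""
    desc = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = desc["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    sm_client.delete_endpoint(EndpointName=endpoint_name)  # stops compute charges
    for variant in config["ProductionVariants"]:
        sm_client.delete_model(ModelName=variant["ModelName"])
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    return config_name


# Usage (requires AWS credentials):
# import boto3
# cleanup_endpoint(boto3.client("sagemaker"), "gemma3-27b-instruct-endpoint")
```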

Conclusion

The availability of Gemma 3 27B Instruct models in Amazon Bedrock Marketplace and SageMaker JumpStart empowers developers, researchers, and businesses to build cutting-edge generative AI applications with ease. With their high performance, multilingual capabilities, and efficient deployment on AWS infrastructure, these models are well suited for a wide range of use cases, from conversational AI to code generation and content automation. By using the seamless discovery and deployment capabilities of SageMaker JumpStart and Amazon Bedrock Marketplace, you can accelerate your AI innovation while benefiting from the secure, scalable, and cost-effective AWS Cloud infrastructure.

We encourage you to explore the Gemma 3 27B Instruct models today by visiting the SageMaker JumpStart console or Amazon Bedrock Marketplace. Deploy the model and experiment with sample prompts to meet your specific needs. For further learning, explore the AWS Machine Learning Blog, the SageMaker JumpStart GitHub repository, and the Amazon Bedrock documentation. Start building your next generative AI solution with Gemma 3 27B Instruct models and unlock new possibilities with AWS!


About the Authors

Santosh Vallurupalli is a Sr. Solutions Architect at AWS. Santosh specializes in networking, containers, and migrations, and enjoys helping customers in their journey of cloud adoption and building cloud-based solutions for challenging issues. In his spare time, he likes traveling, watching Formula1, and watching The Office on repeat.

Aravind Singirikonda is an AI/ML Solutions Architect at AWS. He works with AWS customers in the healthcare and life sciences domain to provide guidance and technical assistance, helping them improve the value of their AI/ML solutions when using AWS.

Pawan Matta is a Sr. Solutions Architect at AWS. He works with AWS customers in the gaming industry and guides them to deploy highly scalable, performant architectures. His area of focus is management and governance. In his free time, he likes to play FIFA and watch cricket.

Ajit Mahareddy is an experienced Product and Go-To-Market (GTM) leader with over 20 years of experience in product management, engineering, and GTM. Prior to his current role, Ajit led product management building AI/ML products at leading technology companies, including Uber, Turing, and eHealth. He is passionate about advancing generative AI technologies and driving real-world impact with generative AI.
