AWS Machine Learning Blog
Mistral-Small-3.2-24B-Instruct-2506 is now available on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart

Mistral-Small-3.2-24B-Instruct-2506, a 24-billion-parameter large language model, is now available to customers through Amazon SageMaker JumpStart and Amazon Bedrock Marketplace. The model is optimized for instruction following and reduced repetitive output, and adds image-text-to-text capability for processing both text and visual inputs. This post walks through how to discover, deploy, and use the model on SageMaker JumpStart and Amazon Bedrock Marketplace, with detailed code examples demonstrating its practical applications in multimodal reasoning and function calling, particularly in analyzing a GDP data chart and an education ROI chart.

🚀 **Mistral-Small-3.2-24B-Instruct-2506 model overview**: The model is an update of Mistral-Small-3.1-24B-Instruct-2503 that keeps the 24-billion-parameter architecture while delivering notable gains in instruction-following accuracy (84.78% vs. 82.75%) and in reducing infinite generation and repetitive answers (1.29% vs. 2.11%). It also adds image-text-to-text capability, supporting document understanding, visual question answering, and image-grounded content generation, and offers a 128,000-token context window, making it well suited for enterprise applications.

🌟 **Deployment on Amazon SageMaker JumpStart and Bedrock Marketplace**: Users can discover, test, and use Mistral-Small-3.2-24B-Instruct-2506 through Amazon Bedrock Marketplace. The model is likewise available in SageMaker JumpStart, where it can be deployed and customized quickly through the SageMaker Studio UI or the SageMaker Python SDK. The post details the deployment prerequisites (such as an AWS account, an IAM role, and a GPU instance) and the step-by-step procedure.

📊 **Multimodal capabilities in practice**: The post shows how the model applies its visual reasoning to complex chart data. For example, analyzing a GDP chart, the model accurately lists the five European countries with the highest GDP; analyzing a box plot of education return on investment (ROI), it identifies the trend that higher tuition does not necessarily yield higher long-term returns, and surfaces the key insight that factors beyond tuition strongly influence ROI.

💡 **Function calling and structured interaction**: Mistral-Small-3.2-24B-Instruct-2506 supports function calling: it can recognize when a user question requires external data and invoke the correct function with the right parameters, enabling structured interaction with external systems, which is essential for building smarter, more adaptive applications.

Today, we’re excited to announce that Mistral-Small-3.2-24B-Instruct-2506—a 24-billion-parameter large language model (LLM) from Mistral AI that’s optimized for enhanced instruction following and reduced repetition errors—is available for customers through Amazon SageMaker JumpStart and Amazon Bedrock Marketplace. Amazon Bedrock Marketplace is a capability in Amazon Bedrock that developers can use to discover, test, and use over 100 popular, emerging, and specialized foundation models (FMs) alongside the current selection of industry-leading models in Amazon Bedrock.

In this post, we walk through how to discover, deploy, and use Mistral-Small-3.2-24B-Instruct-2506 through Amazon Bedrock Marketplace and with SageMaker JumpStart.

Overview of Mistral Small 3.2 (2506)

Mistral Small 3.2 (2506) is an update of Mistral-Small-3.1-24B-Instruct-2503, maintaining the same 24-billion-parameter architecture while delivering improvements in key areas. Released under the Apache 2.0 license, this model maintains a balance between performance and computational efficiency. Mistral offers both the pretrained (Mistral-Small-3.1-24B-Base-2503) and instruction-tuned (Mistral-Small-3.2-24B-Instruct-2506) checkpoints of the model under Apache 2.0.

Key improvements in Mistral Small 3.2 (2506) include:

- Instruction following: accuracy improves to 84.78% from 82.75% in the 2503 release
- Output stability: infinite generation and repetitive answers drop to 1.29% from 2.11%
- Image-text-to-text capability: support for document understanding, visual question answering, and generation of content grounded in images

These improvements make the model particularly well-suited for enterprise applications on AWS where reliability and precision are critical. With a 128,000-token context window, the model can process extensive documents and maintain context throughout longer conversations.

SageMaker JumpStart overview

SageMaker JumpStart is a fully managed service that offers state-of-the-art FMs for various use cases such as content writing, code generation, question answering, copywriting, summarization, classification, and information retrieval. It provides a collection of pre-trained models that you can deploy quickly, accelerating the development and deployment of machine learning (ML) applications. One of the key components of SageMaker JumpStart is model hubs, which offer a vast catalog of pre-trained models, such as Mistral, for a variety of tasks.

You can now discover and deploy Mistral models in Amazon SageMaker Studio or programmatically through the Amazon SageMaker Python SDK, and derive model performance and MLOps controls with SageMaker features such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, and container logs. The model is deployed in a secure AWS environment under your virtual private cloud (VPC) controls, helping to support data security for enterprise needs.

Prerequisites

To deploy Mistral-Small-3.2-24B-Instruct-2506, you must have the following prerequisites:

- An AWS account
- An AWS Identity and Access Management (IAM) role with permissions to deploy and manage SageMaker and Amazon Bedrock Marketplace resources
- Sufficient service quota for a GPU-based instance type (see the sizing note that follows)

If needed, request a quota increase and contact your AWS account team for support. This model requires a GPU-based instance type (approximately 55 GB of GPU RAM in bf16 or fp16) such as ml.g6.12xlarge.
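
If you want to check your current capacity programmatically before deploying, the following minimal sketch (our addition, not from the original post) lists SageMaker service quotas and filters for the instance type. The substring match on the quota name is an assumption; confirm the exact quota name in the Service Quotas console:

import boto3

# Sketch: find your endpoint-usage quota for ml.g6.12xlarge.
# The name-matching heuristic below is an assumption; verify the exact
# quota name in the Service Quotas console for your account.
quotas = boto3.client("service-quotas")
paginator = quotas.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="sagemaker"):
    for quota in page["Quotas"]:
        name = quota["QuotaName"]
        if "ml.g6.12xlarge" in name and "endpoint" in name.lower():
            print(f"{name}: {quota['Value']}")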

Deploy Mistral-Small-3.2-24B-Instruct-2506 in Amazon Bedrock Marketplace

To access Mistral-Small-3.2-24B-Instruct-2506 in Amazon Bedrock Marketplace, complete the following steps:

    On the Amazon Bedrock console, in the navigation pane under Discover, choose Model catalog. Filter for Mistral as a provider and choose the Mistral-Small-3.2-24B-Instruct-2506 model.

The model detail page provides essential information about the model’s capabilities, pricing structure, and implementation guidelines. You can find detailed usage instructions, including sample API calls and code snippets for integration. The page also includes deployment options and licensing information to help you get started with Mistral-Small-3.2-24B-Instruct-2506 in your applications.

    To begin using Mistral-Small-3.2-24B-Instruct-2506, choose Deploy. You will be prompted to configure the deployment details for Mistral-Small-3.2-24B-Instruct-2506. The model ID will be pre-populated.
      For Endpoint name, enter an endpoint name (up to 50 alphanumeric characters).
      For Number of instances, enter a number between 1–100.
      For Instance type, choose your instance type. For optimal performance with Mistral-Small-3.2-24B-Instruct-2506, a GPU-based instance type such as ml.g6.12xlarge is recommended.
      Optionally, configure advanced security and infrastructure settings, including VPC networking, service role permissions, and encryption settings. For most use cases, the default settings will work well. However, for production deployments, review these settings to align with your organization’s security and compliance requirements.
    Choose Deploy to begin using the model.

When the deployment is complete, you can test Mistral-Small-3.2-24B-Instruct-2506 capabilities directly in the Amazon Bedrock playground, a tool on the Amazon Bedrock console that provides a visual interface for experimenting with different models.

    Choose Open in playground to access an interactive interface where you can experiment with different prompts and adjust model parameters such as temperature and maximum length.

The playground provides immediate feedback, helping you understand how the model responds to various inputs and letting you fine-tune your prompts for optimal results.

To invoke the deployed model programmatically with Amazon Bedrock APIs, you need to get the endpoint Amazon Resource Name (ARN). You can use the Converse API for multimodal use cases. For tool use and function calling, use the Invoke Model API.
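
As a minimal sketch of such a call (the endpoint ARN below is a placeholder for the one shown on your deployment details page), a Converse API request against the Marketplace endpoint could look like this:

import boto3

# Placeholder: copy the ARN from your Marketplace deployment details page
endpoint_arn = "arn:aws:sagemaker:us-west-2:123456789012:endpoint/your-endpoint-name"

client = boto3.client("bedrock-runtime", region_name="us-west-2")
response = client.converse(
    modelId=endpoint_arn,  # Marketplace deployments are addressed by their endpoint ARN
    messages=[{"role": "user", "content": [{"text": "Hello!"}]}],
    inferenceConfig={"maxTokens": 512, "temperature": 0.15},
)
print(response["output"]["message"]["content"][0]["text"])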

Reasoning of complex figures

Vision language models (VLMs) excel at interpreting and reasoning about complex figures, charts, and diagrams. In this particular use case, we use Mistral-Small-3.2-24B-Instruct-2506 to analyze an intricate image containing GDP data. Its advanced capabilities in document understanding and complex figure analysis make it well-suited for extracting insights from visual representations of economic data. By processing both the visual elements and accompanying text, Mistral Small 2506 can provide detailed interpretations and reasoned analysis of the GDP figures presented in the image.

We use the following input image.

We have defined helper functions to invoke the model using the Amazon Bedrock Converse API:

def get_image_format(image_path):
    with Image.open(image_path) as img:
        # Normalize the format to a known valid one
        fmt = img.format.lower() if img.format else 'jpeg'
        # Convert 'jpg' to 'jpeg'
        if fmt == 'jpg':
            fmt = 'jpeg'
    return fmt

def call_bedrock_model(model_id=None, prompt="", image_paths=None, system_prompt="", temperature=0.6, top_p=0.9, max_tokens=3000):
    if isinstance(image_paths, str):
        image_paths = [image_paths]
    if image_paths is None:
        image_paths = []

    # Start building the content array for the user message
    content_blocks = []

    # Include a text block if prompt is provided
    if prompt.strip():
        content_blocks.append({"text": prompt})

    # Add images as raw bytes
    for img_path in image_paths:
        fmt = get_image_format(img_path)
        # Read the raw bytes of the image (no base64 encoding!)
        with open(img_path, 'rb') as f:
            image_raw_bytes = f.read()
        content_blocks.append({
            "image": {
                "format": fmt,
                "source": {
                    "bytes": image_raw_bytes
                }
            }
        })

    # Construct the messages structure
    messages = [
        {
            "role": "user",
            "content": content_blocks
        }
    ]

    # Pass a system prompt only if one is provided
    kwargs = {}
    if system_prompt:
        kwargs["system"] = [{"text": system_prompt}]

    # Build the arguments for the `converse` call
    converse_kwargs = {
        "messages": messages,
        "inferenceConfig": {
            "maxTokens": max_tokens,  # use the max_tokens parameter rather than a hardcoded value
            "temperature": temperature,
            "topP": top_p
        },
        **kwargs
    }
    converse_kwargs["modelId"] = model_id

    # Call the Converse API
    try:
        response = client.converse(**converse_kwargs)

        # Parse the assistant response
        assistant_message = response.get('output', {}).get('message', {})
        assistant_content = assistant_message.get('content', [])
        result_text = "".join(block.get('text', '') for block in assistant_content)
    except Exception as e:
        result_text = f"Error message: {e}"

    return result_text

Our prompt and input payload are as follows:

import boto3
import base64
import json
from PIL import Image
from botocore.exceptions import ClientError

# Create a Bedrock Runtime client in the AWS Region you want to use.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

system_prompt = 'You are a Global Economist.'
task = 'List the top 5 countries in Europe with the highest GDP'
image_path = './image_data/gdp.png'

print('Input Image:\n\n')
Image.open(image_path).show()

response = call_bedrock_model(model_id=endpoint_arn,
                              prompt=task,
                              system_prompt=system_prompt,
                              image_paths=image_path)

print(f'\nResponse from the model:\n\n{response}')

The following is a response using the Converse API:

Based on the image provided, the top 5 countries in Europe with the highest GDP are:

1. **Germany**: $3.99T (4.65%)
2. **United Kingdom**: $2.82T (3.29%)
3. **France**: $2.78T (3.24%)
4. **Italy**: $2.07T (2.42%)
5. **Spain**: $1.43T (1.66%)

These countries are highlighted in green, indicating their location in the Europe region.

Deploy Mistral-Small-3.2-24B-Instruct-2506 in SageMaker JumpStart

You can access Mistral-Small-3.2-24B-Instruct-2506 through SageMaker JumpStart in the SageMaker JumpStart UI and the SageMaker Python SDK. SageMaker JumpStart is an ML hub with FMs, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks. With SageMaker JumpStart, you can customize pre-trained models to your use case, with your data, and deploy them into production using either the UI or SDK.

Deploy Mistral-Small-3.2-24B-Instruct-2506 through the SageMaker JumpStart UI

Complete the following steps to deploy the model using the SageMaker JumpStart UI:

    On the SageMaker console, choose Studio in the navigation pane. First-time users will be prompted to create a domain; otherwise, choose Open Studio. On the SageMaker Studio console, access SageMaker JumpStart by choosing JumpStart in the navigation pane.

    Search for and choose Mistral-Small-3.2-24B-Instruct-2506 to view the model card.

    Click the model card to view the model details page. Before you deploy the model, review the configuration and model details from this model card.

    Choose Deploy to proceed with deployment.
      For Endpoint name, enter an endpoint name (up to 50 alphanumeric characters).
      For Number of instances, enter a number between 1–100 (default: 1).
      For Instance type, choose your instance type. For optimal performance with Mistral-Small-3.2-24B-Instruct-2506, a GPU-based instance type such as ml.g6.12xlarge is recommended.

    Choose Deploy to deploy the model and create an endpoint.

When deployment is complete, your endpoint status will change to InService. At this point, the model is ready to accept inference requests through the endpoint. You can invoke the model using a SageMaker runtime client and integrate it with your applications.
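
For example, a minimal sketch with the low-level SageMaker runtime client might look like the following; the endpoint name is a placeholder for the one you chose at deployment, and the payload schema mirrors the SDK examples later in this post:

import boto3
import json

runtime = boto3.client("sagemaker-runtime")

payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 200,
    "temperature": 0.15,
}

# "mistral-small-endpoint" is a placeholder for your endpoint name
response = runtime.invoke_endpoint(
    EndpointName="mistral-small-endpoint",
    ContentType="application/json",
    Body=json.dumps(payload),
)
result = json.loads(response["Body"].read())
print(result["choices"][0]["message"]["content"])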

Deploy Mistral-Small-3.2-24B-Instruct-2506 with the SageMaker Python SDK

Deployment starts when you choose Deploy. After deployment finishes, you will see that an endpoint is created. Test the endpoint by passing a sample inference request payload or by selecting the testing option using the SDK. When you select the option to use the SDK, you will see example code that you can use in the notebook editor of your choice in SageMaker Studio.

To deploy using the SDK, start by selecting the Mistral-Small-3.2-24B-Instruct-2506 model, specified by the model_id with the value huggingface-vlm-mistral-small-3.2-24b-instruct-2506. You can deploy the model on SageMaker using the following code.

from sagemaker.jumpstart.model import JumpStartModel

accept_eula = True
model = JumpStartModel(model_id="huggingface-vlm-mistral-small-3.2-24b-instruct-2506")
predictor = model.deploy(accept_eula=accept_eula)

This deploys the model on SageMaker with default configurations, including the default instance type and default VPC configurations. You can change these configurations by specifying non-default values in JumpStartModel. The EULA value must be explicitly defined as True to accept the end-user license agreement (EULA).

After the model is deployed, you can run inference against the deployed endpoint through the SageMaker predictor:

prompt = "Hello!"payload = {    "messages": [        {            "role": "user",            "content": prompt        }    ],    "max_tokens": 4000,    "temperature": 0.15,    "top_p": 0.9,}    response = predictor.predict(payload)print(response['choices'][0]['message']['content'])We get following response:Hello! 😊 How can I assist you today?

Vision reasoning example

Using the multimodal capabilities of Mistral-Small-3.2-24B-Instruct-2506, you can process both text and images for comprehensive analysis. The following example highlights how the model can analyze a tuition ROI chart, extracting both visual patterns and data points. The following image is the input chart.png.

Our prompt and input payload are as follows:

# Read and encode the image
image_path = "chart.png"
with open(image_path, "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode('utf-8')

# Create a prompt focused on visual analysis of the box plot chart
visual_prompt = """Please analyze this box plot chart showing the relationship between Annual Tuition (x-axis) and 40-Year Net Present Value (y-axis) in US$. Describe the key trend between tuition and net present value shown in this chart. What's one notable insight?"""

# Create payload with image input
payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": visual_prompt},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image}"}}
            ]
        }
    ],
    "max_tokens": 800,
    "temperature": 0.15
}

# Make a prediction
response = predictor.predict(payload)

# Display the visual analysis
message = response['choices'][0]['message']
if message.get('content'):
    print("Vision Analysis:")
    print(message['content'])

We get the following response:

Vision Analysis:

This box plot chart illustrates the relationship between annual tuition costs (x-axis) and the 40-year net present value (NPV) in US dollars (y-axis). Each box plot represents a range of annual tuition costs, showing the distribution of NPV values within that range.

### Key Trend:

1. **General Distribution**: Across all tuition ranges, the median 40-year NPV (indicated by the line inside each box) appears to be relatively consistent, hovering around the $1,000,000 mark.
2. **Variability**: The spread of NPV values (indicated by the height of the boxes and whiskers) is wider for higher tuition ranges, suggesting greater variability in outcomes for more expensive schools.
3. **Outliers**: There are several outliers, particularly in the higher tuition ranges (e.g., 35-40k, 40-45k, and >50k), indicating that some individuals experience significantly higher or lower NPVs.

### Notable Insight:

One notable insight from this chart is that higher tuition costs do not necessarily translate into a higher 40-year net present value. For example, the median NPV for the highest tuition range (>50k) is not significantly higher than that for the lowest tuition range (<5k). This suggests that the return on investment for higher tuition costs may not be proportionally greater, and other factors beyond tuition cost may play a significant role in determining long-term financial outcomes.

This insight highlights the importance of considering factors beyond just tuition costs when evaluating the potential return on investment of higher education.

Function calling example

The following example shows Mistral Small 3.2’s function calling, demonstrating how the model identifies when a user question needs external data and calls the correct function with the proper parameters.

Our prompt and input payload are as follows:

# Define a simple weather function
weather_function = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["location"]
        }
    }
}

# User question
user_question = "What's the weather like in Seattle?"

# Create payload
payload = {
    "messages": [{"role": "user", "content": user_question}],
    "tools": [weather_function],
    "tool_choice": "auto",
    "max_tokens": 200,
    "temperature": 0.15
}

# Make prediction
response = predictor.predict(payload)

# Display raw response to see exactly what we get
print(json.dumps(response['choices'][0]['message'], indent=2))

# Extract function call information from the response content
message = response['choices'][0]['message']
content = message.get('content', '')
if '[TOOL_CALLS]' in content:
    print("Function call details:", content.replace('[TOOL_CALLS]', ''))

We get the following response:

{"role": "assistant","reasoning_content": null,"content": "[TOOL_CALLS]get_weather{\"location\": \"Seattle\"}","tool_calls": []}Function call details: get_weather{"location": "Seattle"}

Clean up

To avoid unwanted charges, complete the following steps in this section to clean up your resources.

Delete the Amazon Bedrock Marketplace deployment

If you deployed the model using Amazon Bedrock Marketplace, complete the following steps:

    On the Amazon Bedrock console, under Tune in the navigation pane, select Marketplace model deployment.
    In the Managed deployments section, locate the endpoint you want to delete.
    Select the endpoint, and on the Actions menu, choose Delete.
    Verify the endpoint details to make sure you’re deleting the correct deployment:
      Endpoint name
      Model name
      Endpoint status
    Choose Delete to delete the endpoint.
    In the deletion confirmation dialog, review the warning message, enter confirm, and choose Delete to permanently remove the endpoint.
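
Alternatively, you can delete the deployment programmatically. Recent boto3 versions expose Bedrock Marketplace endpoint operations; the following sketch assumes your boto3 version includes them, and the ARN is a placeholder:

import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

# Placeholder ARN; this call assumes your boto3 version includes the
# Marketplace model endpoint operations.
endpoint_arn = "arn:aws:sagemaker:us-west-2:123456789012:endpoint/your-endpoint-name"
bedrock.delete_marketplace_model_endpoint(endpointArn=endpoint_arn)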

Delete the SageMaker JumpStart predictor

After you’re done running the notebook, make sure to delete the resources that you created in the process to avoid additional billing. For more details, see Delete Endpoints and Resources. You can use the following code:

predictor.delete_model()
predictor.delete_endpoint()

Conclusion

In this post, we showed you how to get started with Mistral-Small-3.2-24B-Instruct-2506 and deploy the model using Amazon Bedrock Marketplace and SageMaker JumpStart for inference. This latest version of the model brings improvements in instruction following, reduced repetition errors, and enhanced function calling capabilities while maintaining performance across text and vision tasks. The model’s multimodal capabilities, combined with its improved reliability and precision, support enterprise applications requiring robust language understanding and generation.

Visit SageMaker JumpStart in Amazon SageMaker Studio or Amazon Bedrock Marketplace now to get started with Mistral-Small-3.2-24B-Instruct-2506.

For more Mistral resources on AWS, check out the Mistral-on-AWS GitHub repo.


About the authors

Niithiyn Vijeaswaran is a Generative AI Specialist Solutions Architect with the Third-Party Model Science team at AWS. His area of focus is AWS AI accelerators (AWS Neuron). He holds a Bachelor’s degree in Computer Science and Bioinformatics.

Breanne Warner is an Enterprise Solutions Architect at Amazon Web Services supporting healthcare and life science (HCLS) customers. She is passionate about supporting customers to use generative AI on AWS and evangelizing model adoption for first- and third-party models. Breanne is also Vice President of the Women at Amazon board with the goal of fostering an inclusive and diverse culture at Amazon. Breanne holds a Bachelor of Science in Computer Engineering from the University of Illinois Urbana-Champaign.

Koushik Mani is an Associate Solutions Architect at AWS. He previously worked as a Software Engineer for 2 years focusing on machine learning and cloud computing use cases at Telstra. He completed his Master’s in Computer Science from the University of Southern California. He is passionate about machine learning and generative AI use cases and building solutions.
