AWS Machine Learning Blog
Mistral-Small-3.2-24B-Instruct-2506 is now available on Amazon Bedrock Marketplace and Amazon SageMaker JumpStart

Mistral-Small-3.2-24B-Instruct-2506, a 24-billion-parameter large language model, is now available to customers through Amazon SageMaker JumpStart and Amazon Bedrock Marketplace. The model is optimized for instruction following and reduced repetitive output, and adds image-text-to-text capability for processing both text and visual inputs. This post walks through how to discover, deploy, and use the model on SageMaker JumpStart and Amazon Bedrock Marketplace, with detailed code examples demonstrating its practical applications in multimodal reasoning and function calling, particularly in analyzing a GDP data chart and an education ROI chart.

🚀 **Mistral-Small-3.2-24B-Instruct-2506 model overview**: The model is an update of Mistral-Small-3.1-24B-Instruct-2503 that keeps the 24-billion-parameter architecture while delivering notable gains in instruction-following accuracy (84.78% vs. 82.75%) and in reducing infinite generation and repetitive answers (1.29% vs. 2.11%). It also adds image-text-to-text capability, supporting document understanding, visual question answering, and image-grounded content generation, and offers a 128,000-token context window, making it well suited for enterprise applications.

🌟 **Deployment on Amazon SageMaker JumpStart and Bedrock Marketplace**: Users can discover, test, and use Mistral-Small-3.2-24B-Instruct-2506 through Amazon Bedrock Marketplace. The model is likewise available in SageMaker JumpStart, where it can be deployed and customized quickly through the SageMaker Studio UI or the SageMaker Python SDK. The post details the deployment prerequisites (such as an AWS account, an IAM role, and a GPU instance) and the step-by-step procedure.

📊 **Multimodal capabilities in practice**: The post shows how the model applies its visual reasoning to complex chart data. For example, analyzing a GDP chart, the model accurately lists the five European countries with the highest GDP; analyzing a box plot of education return on investment (ROI), it identifies the trend that higher tuition does not necessarily yield higher long-term returns, and surfaces the key insight that factors beyond tuition strongly influence ROI.

💡 **Function calling and structured interaction**: Mistral-Small-3.2-24B-Instruct-2506 supports function calling: it can recognize when a user question requires external data and invoke the correct function with the right parameters, enabling structured interaction with external systems, which is essential for building smarter, more adaptive applications.

Today, we’re excited to announce that Mistral-Small-3.2-24B-Instruct-2506—a 24-billion-parameter large language model (LLM) from Mistral AI that’s optimized for enhanced instruction following and reduced repetition errors—is available for customers through Amazon SageMaker JumpStart and Amazon Bedrock Marketplace. Amazon Bedrock Marketplace is a capability in Amazon Bedrock that developers can use to discover, test, and use over 100 popular, emerging, and specialized foundation models (FMs) alongside the current selection of industry-leading models in Amazon Bedrock.

In this post, we walk through how to discover, deploy, and use Mistral-Small-3.2-24B-Instruct-2506 through Amazon Bedrock Marketplace and with SageMaker JumpStart.

Overview of Mistral Small 3.2 (2506)

Mistral Small 3.2 (2506) is an update of Mistral-Small-3.1-24B-Instruct-2503, maintaining the same 24-billion-parameter architecture while delivering improvements in key areas. Released under the Apache 2.0 license, this model maintains a balance between performance and computational efficiency. Mistral offers both the pretrained (Mistral-Small-3.1-24B-Base-2503) and instruction-tuned (Mistral-Small-3.2-24B-Instruct-2506) checkpoints of the model under Apache 2.0.

Key improvements in Mistral Small 3.2 (2506) include:

- Instruction following: accuracy improves to 84.78% from 82.75% in the 2503 release
- Output stability: infinite generation and repetitive answers drop to 1.29% from 2.11%
- Image-text-to-text capability: support for document understanding, visual question answering, and generation of content grounded in images

These improvements make the model particularly well-suited for enterprise applications on AWS where reliability and precision are critical. With a 128,000-token context window, the model can process extensive documents and maintain context throughout longer conversations.

SageMaker JumpStart overview

SageMaker JumpStart is a fully managed service that offers state-of-the-art FMs for various use cases such as content writing, code generation, question answering, copywriting, summarization, classification, and information retrieval. It provides a collection of pre-trained models that you can deploy quickly, accelerating the development and deployment of machine learning (ML) applications. One of the key components of SageMaker JumpStart is model hubs, which offer a vast catalog of pre-trained models, such as Mistral, for a variety of tasks.

You can now discover and deploy Mistral models in Amazon SageMaker Studio or programmatically through the Amazon SageMaker Python SDK, and derive model performance and MLOps controls with SageMaker features such as Amazon SageMaker Pipelines, Amazon SageMaker Debugger, and container logs. The model is deployed in a secure AWS environment under your virtual private cloud (VPC) controls, helping to support data security for enterprise needs.

Prerequisites

To deploy Mistral-Small-3.2-24B-Instruct-2506, you must have the following prerequisites:

- An AWS account
- An AWS Identity and Access Management (IAM) role with permissions to deploy and manage SageMaker and Amazon Bedrock Marketplace resources
- Sufficient service quota for a GPU-based instance type (see the sizing note that follows)

If needed, request a quota increase and contact your AWS account team for support. This model requires a GPU-based instance type (approximately 55 GB of GPU RAM in bf16 or fp16) such as ml.g6.12xlarge.
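
If you want to check your current capacity programmatically before deploying, the following minimal sketch (our addition, not from the original post) lists SageMaker service quotas and filters for the instance type. The substring match on the quota name is an assumption; confirm the exact quota name in the Service Quotas console:

import boto3

# Sketch: find your endpoint-usage quota for ml.g6.12xlarge.
# The name-matching heuristic below is an assumption; verify the exact
# quota name in the Service Quotas console for your account.
quotas = boto3.client("service-quotas")
paginator = quotas.get_paginator("list_service_quotas")
for page in paginator.paginate(ServiceCode="sagemaker"):
    for quota in page["Quotas"]:
        name = quota["QuotaName"]
        if "ml.g6.12xlarge" in name and "endpoint" in name.lower():
            print(f"{name}: {quota['Value']}")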

Deploy Mistral-Small-3.2-24B-Instruct-2506 in Amazon Bedrock Marketplace

To access Mistral-Small-3.2-24B-Instruct-2506 in Amazon Bedrock Marketplace, complete the following steps:

    On the Amazon Bedrock console, in the navigation pane under Discover, choose Model catalog. Filter for Mistral as a provider and choose the Mistral-Small-3.2-24B-Instruct-2506 model.

The model detail page provides essential information about the model’s capabilities, pricing structure, and implementation guidelines. You can find detailed usage instructions, including sample API calls and code snippets for integration. The page also includes deployment options and licensing information to help you get started with Mistral-Small-3.2-24B-Instruct-2506 in your applications.

    To begin using Mistral-Small-3.2-24B-Instruct-2506, choose Deploy. You will be prompted to configure the deployment details for Mistral-Small-3.2-24B-Instruct-2506. The model ID will be pre-populated.
      For Endpoint name, enter an endpoint name (up to 50 alphanumeric characters).
      For Number of instances, enter a number between 1–100.
      For Instance type, choose your instance type. For optimal performance with Mistral-Small-3.2-24B-Instruct-2506, a GPU-based instance type such as ml.g6.12xlarge is recommended.
      Optionally, configure advanced security and infrastructure settings, including VPC networking, service role permissions, and encryption settings. For most use cases, the default settings will work well. However, for production deployments, review these settings to align with your organization’s security and compliance requirements.
    Choose Deploy to begin using the model.

When the deployment is complete, you can test Mistral-Small-3.2-24B-Instruct-2506 capabilities directly in the Amazon Bedrock playground, a tool on the Amazon Bedrock console that provides a visual interface for experimenting with different models.

    Choose Open in playground to access an interactive interface where you can experiment with different prompts and adjust model parameters such as temperature and maximum length.

The playground provides immediate feedback, helping you understand how the model responds to various inputs and letting you fine-tune your prompts for optimal results.

To invoke the deployed model programmatically with Amazon Bedrock APIs, you need to get the endpoint Amazon Resource Name (ARN). You can use the Converse API for multimodal use cases. For tool use and function calling, use the Invoke Model API.
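
As a minimal sketch of such a call (the endpoint ARN below is a placeholder for the one shown on your deployment details page), a Converse API request against the Marketplace endpoint could look like this:

import boto3

# Placeholder: copy the ARN from your Marketplace deployment details page
endpoint_arn = "arn:aws:sagemaker:us-west-2:123456789012:endpoint/your-endpoint-name"

client = boto3.client("bedrock-runtime", region_name="us-west-2")
response = client.converse(
    modelId=endpoint_arn,  # Marketplace deployments are addressed by their endpoint ARN
    messages=[{"role": "user", "content": [{"text": "Hello!"}]}],
    inferenceConfig={"maxTokens": 512, "temperature": 0.15},
)
print(response["output"]["message"]["content"][0]["text"])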

Reasoning of complex figures

Vision language models (VLMs) excel at interpreting and reasoning about complex figures, charts, and diagrams. In this particular use case, we use Mistral-Small-3.2-24B-Instruct-2506 to analyze an intricate image containing GDP data. Its advanced capabilities in document understanding and complex figure analysis make it well-suited for extracting insights from visual representations of economic data. By processing both the visual elements and accompanying text, Mistral Small 2506 can provide detailed interpretations and reasoned analysis of the GDP figures presented in the image.

We use the following input image.

We have defined helper functions to invoke the model using the Amazon Bedrock Converse API:

def get_image_format(image_path):
    with Image.open(image_path) as img:
        # Normalize the format to a known valid one
        fmt = img.format.lower() if img.format else 'jpeg'
        # Convert 'jpg' to 'jpeg'
        if fmt == 'jpg':
            fmt = 'jpeg'
    return fmt

def call_bedrock_model(model_id=None, prompt="", image_paths=None, system_prompt="", temperature=0.6, top_p=0.9, max_tokens=3000):
    if isinstance(image_paths, str):
        image_paths = [image_paths]
    if image_paths is None:
        image_paths = []

    # Start building the content array for the user message
    content_blocks = []

    # Include a text block if prompt is provided
    if prompt.strip():
        content_blocks.append({"text": prompt})

    # Add images as raw bytes
    for img_path in image_paths:
        fmt = get_image_format(img_path)
        # Read the raw bytes of the image (no base64 encoding!)
        with open(img_path, 'rb') as f:
            image_raw_bytes = f.read()
        content_blocks.append({
            "image": {
                "format": fmt,
                "source": {
                    "bytes": image_raw_bytes
                }
            }
        })

    # Construct the messages structure
    messages = [
        {
            "role": "user",
            "content": content_blocks
        }
    ]

    # Pass a system prompt only if one is provided
    kwargs = {}
    if system_prompt:
        kwargs["system"] = [{"text": system_prompt}]

    # Build the arguments for the `converse` call
    converse_kwargs = {
        "messages": messages,
        "inferenceConfig": {
            "maxTokens": max_tokens,  # use the max_tokens parameter rather than a hardcoded value
            "temperature": temperature,
            "topP": top_p
        },
        **kwargs
    }
    converse_kwargs["modelId"] = model_id

    # Call the Converse API
    try:
        response = client.converse(**converse_kwargs)

        # Parse the assistant response
        assistant_message = response.get('output', {}).get('message', {})
        assistant_content = assistant_message.get('content', [])
        result_text = "".join(block.get('text', '') for block in assistant_content)
    except Exception as e:
        result_text = f"Error message: {e}"

    return result_text

Our prompt and input payload are as follows:

import boto3
import base64
import json
from PIL import Image
from botocore.exceptions import ClientError

# Create a Bedrock Runtime client in the AWS Region you want to use.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

system_prompt = 'You are a Global Economist.'
task = 'List the top 5 countries in Europe with the highest GDP'
image_path = './image_data/gdp.png'

print('Input Image:\n\n')
Image.open(image_path).show()

response = call_bedrock_model(model_id=endpoint_arn,
                              prompt=task,
                              system_prompt=system_prompt,
                              image_paths=image_path)

print(f'\nResponse from the model:\n\n{response}')

The following is a response using the Converse API:

Based on the image provided, the top 5 countries in Europe with the highest GDP are:

1. **Germany**: $3.99T (4.65%)
2. **United Kingdom**: $2.82T (3.29%)
3. **France**: $2.78T (3.24%)
4. **Italy**: $2.07T (2.42%)
5. **Spain**: $1.43T (1.66%)

These countries are highlighted in green, indicating their location in the Europe region.

Deploy Mistral-Small-3.2-24B-Instruct-2506 in SageMaker JumpStart

You can access Mistral-Small-3.2-24B-Instruct-2506 through SageMaker JumpStart in the SageMaker JumpStart UI and the SageMaker Python SDK. SageMaker JumpStart is an ML hub with FMs, built-in algorithms, and prebuilt ML solutions that you can deploy with just a few clicks. With SageMaker JumpStart, you can customize pre-trained models to your use case, with your data, and deploy them into production using either the UI or SDK.

Deploy Mistral-Small-3.2-24B-Instruct-2506 through the SageMaker JumpStart UI

Complete the following steps to deploy the model using the SageMaker JumpStart UI:

    On the SageMaker console, choose Studio in the navigation pane. First-time users will be prompted to create a domain; otherwise, choose Open Studio. On the SageMaker Studio console, access SageMaker JumpStart by choosing JumpStart in the navigation pane.

    Search for and choose Mistral-Small-3.2-24B-Instruct-2506 to view the model card.

    Click the model card to view the model details page. Before you deploy the model, review the configuration and model details from this model card.

    Choose Deploy to proceed with deployment.
      For Endpoint name, enter an endpoint name (up to 50 alphanumeric characters).
      For Number of instances, enter a number between 1–100 (default: 1).
      For Instance type, choose your instance type. For optimal performance with Mistral-Small-3.2-24B-Instruct-2506, a GPU-based instance type such as ml.g6.12xlarge is recommended.

    Choose Deploy to deploy the model and create an endpoint.

When deployment is complete, your endpoint status will change to InService. At this point, the model is ready to accept inference requests through the endpoint. You can invoke the model using a SageMaker runtime client and integrate it with your applications.
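
For example, a minimal sketch with the low-level SageMaker runtime client might look like the following; the endpoint name is a placeholder for the one you chose at deployment, and the payload schema mirrors the SDK examples later in this post:

import boto3
import json

runtime = boto3.client("sagemaker-runtime")

payload = {
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 200,
    "temperature": 0.15,
}

# "mistral-small-endpoint" is a placeholder for your endpoint name
response = runtime.invoke_endpoint(
    EndpointName="mistral-small-endpoint",
    ContentType="application/json",
    Body=json.dumps(payload),
)
result = json.loads(response["Body"].read())
print(result["choices"][0]["message"]["content"])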

Deploy Mistral-Small-3.2-24B-Instruct-2506 with the SageMaker Python SDK

Deployment starts when you choose Deploy. After deployment finishes, you will see that an endpoint is created. Test the endpoint by passing a sample inference request payload or by selecting the testing option using the SDK. When you select the option to use the SDK, you will see example code that you can use in the notebook editor of your choice in SageMaker Studio.

To deploy using the SDK, start by selecting the Mistral-Small-3.2-24B-Instruct-2506 model, specified by the model_id with the value huggingface-vlm-mistral-small-3.2-24b-instruct-2506. You can deploy the model on SageMaker using the following code.

from sagemaker.jumpstart.model import JumpStartModel

accept_eula = True
model = JumpStartModel(model_id="huggingface-vlm-mistral-small-3.2-24b-instruct-2506")
predictor = model.deploy(accept_eula=accept_eula)

This deploys the model on SageMaker with default configurations, including the default instance type and default VPC configurations. You can change these configurations by specifying non-default values in JumpStartModel. The EULA value must be explicitly defined as True to accept the end-user license agreement (EULA).

After the model is deployed, you can run inference against the deployed endpoint through the SageMaker predictor:

prompt = "Hello!"payload = {    "messages": [        {            "role": "user",            "content": prompt        }    ],    "max_tokens": 4000,    "temperature": 0.15,    "top_p": 0.9,}    response = predictor.predict(payload)print(response['choices'][0]['message']['content'])We get following response:Hello! 😊 How can I assist you today?

Vision reasoning example

Using the multimodal capabilities of Mistral-Small-3.2-24B-Instruct-2506, you can process both text and images for comprehensive analysis. The following example highlights how the model can analyze a tuition ROI chart, extracting both visual patterns and data points. The following image is the input chart.png.

Our prompt and input payload are as follows:

# Read and encode the image
image_path = "chart.png"
with open(image_path, "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode('utf-8')

# Create a prompt focused on visual analysis of the box plot chart
visual_prompt = """Please analyze this box plot chart showing the relationship between Annual Tuition (x-axis) and 40-Year Net Present Value (y-axis) in US$. Describe the key trend between tuition and net present value shown in this chart. What's one notable insight?"""

# Create payload with image input
payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": visual_prompt},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{base64_image}"}}
            ]
        }
    ],
    "max_tokens": 800,
    "temperature": 0.15
}

# Make a prediction
response = predictor.predict(payload)

# Display the visual analysis
message = response['choices'][0]['message']
if message.get('content'):
    print("Vision Analysis:")
    print(message['content'])

We get the following response:

Vision Analysis:

This box plot chart illustrates the relationship between annual tuition costs (x-axis) and the 40-year net present value (NPV) in US dollars (y-axis). Each box plot represents a range of annual tuition costs, showing the distribution of NPV values within that range.

### Key Trend:

1. **General Distribution**: Across all tuition ranges, the median 40-year NPV (indicated by the line inside each box) appears to be relatively consistent, hovering around the $1,000,000 mark.
2. **Variability**: The spread of NPV values (indicated by the height of the boxes and whiskers) is wider for higher tuition ranges, suggesting greater variability in outcomes for more expensive schools.
3. **Outliers**: There are several outliers, particularly in the higher tuition ranges (e.g., 35-40k, 40-45k, and >50k), indicating that some individuals experience significantly higher or lower NPVs.

### Notable Insight:

One notable insight from this chart is that higher tuition costs do not necessarily translate into a higher 40-year net present value. For example, the median NPV for the highest tuition range (>50k) is not significantly higher than that for the lowest tuition range (<5k). This suggests that the return on investment for higher tuition costs may not be proportionally greater, and other factors beyond tuition cost may play a significant role in determining long-term financial outcomes.

This insight highlights the importance of considering factors beyond just tuition costs when evaluating the potential return on investment of higher education.

Function calling example

The following example shows Mistral Small 3.2’s function calling, demonstrating how the model identifies when a user question needs external data and calls the correct function with the proper parameters.

Our prompt and input payload are as follows:

# Define a simple weather function
weather_function = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["location"]
        }
    }
}

# User question
user_question = "What's the weather like in Seattle?"

# Create payload
payload = {
    "messages": [{"role": "user", "content": user_question}],
    "tools": [weather_function],
    "tool_choice": "auto",
    "max_tokens": 200,
    "temperature": 0.15
}

# Make prediction
response = predictor.predict(payload)

# Display raw response to see exactly what we get
print(json.dumps(response['choices'][0]['message'], indent=2))

# Extract function call information from the response content
message = response['choices'][0]['message']
content = message.get('content', '')
if '[TOOL_CALLS]' in content:
    print("Function call details:", content.replace('[TOOL_CALLS]', ''))

We get the following response:

{"role": "assistant","reasoning_content": null,"content": "[TOOL_CALLS]get_weather{\"location\": \"Seattle\"}","tool_calls": []}Function call details: get_weather{"location": "Seattle"}

Clean up

To avoid unwanted charges, complete the following steps in this section to clean up your resources.

Delete the Amazon Bedrock Marketplace deployment

If you deployed the model using Amazon Bedrock Marketplace, complete the following steps:

    On the Amazon Bedrock console, under Tune in the navigation pane, select Marketplace model deployment.
    In the Managed deployments section, locate the endpoint you want to delete.
    Select the endpoint, and on the Actions menu, choose Delete.
    Verify the endpoint details to make sure you’re deleting the correct deployment:
      Endpoint name
      Model name
      Endpoint status
    Choose Delete to delete the endpoint.
    In the deletion confirmation dialog, review the warning message, enter confirm, and choose Delete to permanently remove the endpoint.
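
Alternatively, you can delete the deployment programmatically. Recent boto3 versions expose Bedrock Marketplace endpoint operations; the following sketch assumes your boto3 version includes them, and the ARN is a placeholder:

import boto3

bedrock = boto3.client("bedrock", region_name="us-west-2")

# Placeholder ARN; this call assumes your boto3 version includes the
# Marketplace model endpoint operations.
endpoint_arn = "arn:aws:sagemaker:us-west-2:123456789012:endpoint/your-endpoint-name"
bedrock.delete_marketplace_model_endpoint(endpointArn=endpoint_arn)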

Delete the SageMaker JumpStart predictor

After you’re done running the notebook, make sure to delete the resources that you created in the process to avoid additional billing. For more details, see Delete Endpoints and Resources. You can use the following code:

predictor.delete_model()
predictor.delete_endpoint()

Conclusion

In this post, we showed you how to get started with Mistral-Small-3.2-24B-Instruct-2506 and deploy the model using Amazon Bedrock Marketplace and SageMaker JumpStart for inference. This latest version of the model brings improvements in instruction following, reduced repetition errors, and enhanced function calling capabilities while maintaining performance across text and vision tasks. The model’s multimodal capabilities, combined with its improved reliability and precision, support enterprise applications requiring robust language understanding and generation.

Visit SageMaker JumpStart in Amazon SageMaker Studio or Amazon Bedrock Marketplace now to get started with Mistral-Small-3.2-24B-Instruct-2506.

For more Mistral resources on AWS, check out the Mistral-on-AWS GitHub repo.


About the authors

Niithiyn Vijeaswaran is a Generative AI Specialist Solutions Architect with the Third-Party Model Science team at AWS. His area of focus is AWS AI accelerators (AWS Neuron). He holds a Bachelor’s degree in Computer Science and Bioinformatics.

Breanne Warner is an Enterprise Solutions Architect at Amazon Web Services supporting healthcare and life science (HCLS) customers. She is passionate about supporting customers to use generative AI on AWS and evangelizing model adoption for first- and third-party models. Breanne is also Vice President of the Women at Amazon board with the goal of fostering an inclusive and diverse culture at Amazon. Breanne holds a Bachelor of Science in Computer Engineering from the University of Illinois Urbana-Champaign.

Koushik Mani is an Associate Solutions Architect at AWS. He previously worked as a Software Engineer for 2 years focusing on machine learning and cloud computing use cases at Telstra. He completed his Master’s in Computer Science from the University of Southern California. He is passionate about machine learning and generative AI use cases and building solutions.
