AWS Machine Learning Blog, February 12
Falcon 3 models now available in Amazon SageMaker JumpStart

 

This post covers the availability of the Falcon 3 family of models, developed by the Technology Innovation Institute (TII), in Amazon SageMaker JumpStart. The Falcon 3 family comprises five base models ranging from 1 billion to 10 billion parameters, with a focus on stronger science, math, and coding capabilities. SageMaker JumpStart offers convenient ways to deploy these models, both through the UI and through the SageMaker Python SDK. The post walks through both deployment methods and shows how to run inference against the models and clean up resources. With the addition of Falcon 3, data scientists and ML engineers can more easily discover, access, and run pre-trained foundation models, accelerating their machine learning work.

🚀 The Falcon 3 family, developed by TII, consists of base models at several scales, designed to improve capabilities in science, math, and coding. Among them, Falcon3-10B-Base achieves state-of-the-art zero-shot and few-shot performance for models under 13 billion parameters.

💻 SageMaker JumpStart offers two ways to deploy Falcon 3 models: through the intuitive UI, where users can search for and deploy models from SageMaker Studio, or programmatically through the SageMaker Python SDK, which suits automation and integration into MLOps pipelines.

⚙️ Deploying a Falcon 3 model with the SageMaker Python SDK involves configuring a ModelBuilder and a SchemaBuilder and setting the necessary parameters, such as the instance type and sample inputs and outputs. After deployment, a predictor serves inference, and the model and endpoint can be cleaned up in code.

💰 The post also points to the new scale-down-to-zero capability in SageMaker Inference, with links to the relevant documentation, which helps users optimize resource utilization and reduce costs in practice.

Today, we are excited to announce that the Falcon 3 family of models from TII is available in Amazon SageMaker JumpStart. In this post, we explore how to deploy these models efficiently on Amazon SageMaker AI.

Overview of the Falcon 3 family of models

The Falcon 3 family, developed by the Technology Innovation Institute (TII) in Abu Dhabi, represents a significant advancement in open source language models. This collection includes five base models ranging from 1 billion to 10 billion parameters, with a focus on enhancing science, math, and coding capabilities. The family consists of Falcon3-1B-Base, Falcon3-3B-Base, Falcon3-Mamba-7B-Base, Falcon3-7B-Base, and Falcon3-10B-Base, along with their instruct counterparts.

These models showcase innovations such as efficient pre-training techniques, scaling for improved reasoning, and knowledge distillation for better performance in smaller models. Notably, the Falcon3-10B-Base model achieves state-of-the-art performance for models under 13 billion parameters in zero-shot and few-shot tasks. The Falcon 3 family also includes various fine-tuned versions like Instruct models and supports different quantization formats, making them versatile for a wide range of applications.

Currently, SageMaker JumpStart offers the base versions of Falcon3-3B, Falcon3-7B, and Falcon3-10B, along with their corresponding instruct variants, as well as Falcon3-1B-Instruct.

Get started with SageMaker JumpStart

SageMaker JumpStart is a machine learning (ML) hub that can help accelerate your ML journey. With SageMaker JumpStart, you can evaluate, compare, and select pre-trained foundation models (FMs), including Falcon 3 models. These models are fully customizable for your use case with your data.

Deploying a Falcon 3 model through SageMaker JumpStart offers two convenient approaches: using the intuitive SageMaker JumpStart UI or implementing programmatically through the SageMaker Python SDK. Let’s explore both methods to help you choose the approach that best suits your needs.

Deploy Falcon 3 using the SageMaker JumpStart UI

Complete the following steps to deploy Falcon 3 through the JumpStart UI:

    To access SageMaker JumpStart, use one of the following methods:
      In Amazon SageMaker Unified Studio, on the Build menu, choose JumpStart models under Model development.
      Alternatively, in Amazon SageMaker Studio, choose JumpStart in the navigation pane.
    Search for Falcon3-10B-Base in the model browser.
    Choose the model and choose Deploy.
    For Instance type, either use the default instance or choose a different instance. Choose Deploy.
    After some time, the endpoint status will show as InService and you will be able to run inference against it.
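Once the endpoint is InService, you can call it from any AWS SDK without going back through the UI. The sketch below is an illustration, not part of the original post: the endpoint name `falcon3-10b-base-endpoint` is a hypothetical placeholder (use the name shown in the console), while the payload shape matches the sample input used later in this post.

```python
import json

def build_payload(prompt, max_new_tokens=128, top_p=0.9, temperature=0.6):
    """Build the JSON payload expected by the Falcon 3 text-generation container."""
    return {
        "inputs": prompt,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "top_p": top_p,
            "temperature": temperature,
        },
    }

def invoke_endpoint(endpoint_name, prompt):
    """Call the deployed endpoint via boto3 (requires AWS credentials)."""
    import boto3  # imported lazily so build_payload works without AWS installed
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,  # hypothetical: e.g. "falcon3-10b-base-endpoint"
        ContentType="application/json",
        Body=json.dumps(build_payload(prompt)),
    )
    return json.loads(response["Body"].read())
```

The same payload format is used by the SDK-based deployment in the next section.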

Deploy Falcon 3 programmatically using the SageMaker Python SDK

For teams looking to automate deployment or integrate with existing MLOps pipelines, you can use the SageMaker Python SDK:

from sagemaker.serve.builder.model_builder import ModelBuilder
from sagemaker.serve.builder.schema_builder import SchemaBuilder
from sagemaker.jumpstart.model import ModelAccessConfig
from sagemaker.session import Session
import logging

sagemaker_session = Session()
artifacts_bucket_name = sagemaker_session.default_bucket()
execution_role_arn = sagemaker_session.get_caller_identity_arn()

js_model_id = "huggingface-llm-falcon-3-10B-base"
gpu_instance_type = "ml.g5.12xlarge"

response = "Hello, I'm a language model, and I'm here to help you with your English."
sample_input = {
    "inputs": "Hello, I'm a language model,",
    "parameters": {"max_new_tokens": 128, "top_p": 0.9, "temperature": 0.6},
}
sample_output = [{"generated_text": response}]

# SchemaBuilder infers serialization from the sample request and response
schema_builder = SchemaBuilder(sample_input, sample_output)

model_builder = ModelBuilder(
    model=js_model_id,
    schema_builder=schema_builder,
    sagemaker_session=sagemaker_session,
    role_arn=execution_role_arn,
    log_level=logging.ERROR,
)

model = model_builder.build()
predictor = model.deploy(
    model_access_configs={js_model_id: ModelAccessConfig(accept_eula=True)},
    accept_eula=True,
)

Run inference on the predictor:

predictor.predict(sample_input)
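The endpoint returns a list of generations with the text in a `generated_text` field, matching the `sample_output` defined above. A small helper (an assumption about that response shape, not code from the original post) makes the text easy to pull out:

```python
def extract_generated_text(response):
    """Pull the generated string out of a [{"generated_text": ...}] response."""
    if not isinstance(response, list) or not response:
        raise ValueError(f"unexpected response shape: {response!r}")
    return response[0]["generated_text"]

# Example with the sample output shape from the deployment code above
sample_output = [{"generated_text": "Hello, I'm a language model, and I'm here to help you with your English."}]
print(extract_generated_text(sample_output))
```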

If you want to set up the ability to scale down to zero after deployment, refer to Unlock cost savings with the new scale down to zero feature in SageMaker Inference.
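Scale to zero is built on inference components and Application Auto Scaling rather than on the predictor object used above. A minimal sketch of registering a scalable target with a minimum of zero copies follows; it assumes your model was deployed as an inference component (the `ModelBuilder` flow in this post does not create one by itself), and the component name `falcon3-10b-ic` is a hypothetical placeholder. See the linked post for the full setup.

```python
def scale_to_zero_config(inference_component_name, max_copies=1):
    """Application Auto Scaling target allowing zero copies when idle."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"inference-component/{inference_component_name}",
        "ScalableDimension": "sagemaker:inference-component:DesiredCopyCount",
        "MinCapacity": 0,  # allow scaling down to zero copies
        "MaxCapacity": max_copies,
    }

def register_scale_to_zero(inference_component_name, max_copies=1):
    """Register the target with Application Auto Scaling (requires AWS credentials)."""
    import boto3  # imported lazily so the config helper runs without AWS installed
    aas = boto3.client("application-autoscaling")
    aas.register_scalable_target(
        **scale_to_zero_config(inference_component_name, max_copies)
    )

# register_scale_to_zero("falcon3-10b-ic")  # hypothetical component name
```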

Clean up

To clean up the model and endpoint, use the following code:

predictor.delete_model()
predictor.delete_endpoint()

Conclusion

In this post, we explored how SageMaker JumpStart empowers data scientists and ML engineers to discover, access, and run a wide range of pre-trained FMs for inference, including the Falcon 3 family of models. Visit SageMaker JumpStart in SageMaker Studio now to get started. For more information, refer to SageMaker JumpStart pretrained models, Amazon SageMaker JumpStart Foundation Models, and Getting started with Amazon SageMaker JumpStart.


About the authors

Niithiyn Vijeaswaran is a Generative AI Specialist Solutions Architect with the Third-Party Model Science team at AWS. His area of focus is generative AI and AWS AI Accelerators. He holds a Bachelor’s degree in Computer Science and Bioinformatics.

Marc Karp is an ML Architect with the Amazon SageMaker Service team. He focuses on helping customers design, deploy, and manage ML workloads at scale. In his spare time, he enjoys traveling and exploring new places.

Raghu Ramesha is a Senior ML Solutions Architect with the Amazon SageMaker Service team. He focuses on helping customers build, deploy, and migrate ML production workloads to SageMaker at scale. He specializes in machine learning, AI, and computer vision domains, and holds a master’s degree in Computer Science from UT Dallas. In his free time, he enjoys traveling and photography.

Banu Nagasundaram leads product, engineering, and strategic partnerships for SageMaker JumpStart, SageMaker’s machine learning and GenAI hub. She is passionate about building solutions that help customers accelerate their AI journey and unlock business value.
