AWS Machine Learning Blog, September 5, 2024
Effectively manage foundation models for generative AI applications with Amazon SageMaker Model Registry

 

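Amazon SageMaker Model Registry has added new features for managing generative AI foundation models, including simplified registration and faster deployment, and they apply to a wide range of use cases.

🎯 Model Registry supports generative AI applications with rich features for operational excellence in model management. It helps catalog and manage model versions and facilitates collaboration and governance; once a model is trained and evaluated, it can be stored in Model Registry for management.

💡 Model Registry has released new features that make it easier to register fine-tuned foundation models. You can now register uncompressed model artifacts and set an EULA acceptance flag without user intervention, which also reduces latency at endpoint startup.

🚀 Model Registry supports automatic population of inference specification files for certain models, including select AWS Marketplace models. You can register a model by providing a source model URI and add the relevant artifacts later to make it deployable.

📈 As organizations continue to adopt generative AI across their business, Model Registry enables version control, tracking, collaboration, lifecycle management, and governance of foundation models, helping you better manage and adopt generative AI to achieve transformational outcomes.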

Generative artificial intelligence (AI) foundation models (FMs) are gaining popularity with businesses due to their versatility and potential to address a variety of use cases. The true value of FMs is realized when they are adapted to domain-specific data. Managing these models across the business and the model lifecycle can introduce complexity. As FMs are adapted to different domains and data, operationalizing these pipelines becomes critical.

Amazon SageMaker, a fully managed service to build, train, and deploy machine learning (ML) models, has seen increased adoption to customize and deploy FMs that power generative AI applications. SageMaker provides rich features to build automated workflows for deploying models at scale. One of the key features that enables operational excellence around model management is the Model Registry. Model Registry helps catalog and manage model versions and facilitates collaboration and governance. When a model is trained and evaluated for performance, it can be stored in the Model Registry for model management.

Amazon SageMaker has released new features in Model Registry that make it easy to version and catalog FMs. Customers can use SageMaker to train or tune FMs, including Amazon SageMaker JumpStart and Amazon Bedrock models, and also manage these models within Model Registry. As customers begin to scale generative AI applications across various use cases such as fine-tuning for domain-specific tasks, the number of models can quickly grow. To keep track of models, versions, and associated metadata, SageMaker Model Registry can be used as an inventory of models.
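To make the idea of an inventory concrete, the following is a minimal sketch that walks every model group and lists its versions. It assumes sm_client behaves like boto3.client("sagemaker"); the helper names list_inventory and _list_versions are hypothetical and not part of any SDK.

```python
from typing import Any, Dict, List


def _list_versions(sm_client: Any, group_name: str) -> List[Dict[str, Any]]:
    """Collect a short summary of every version registered in one model group."""
    versions: List[Dict[str, Any]] = []
    kwargs: Dict[str, Any] = {"ModelPackageGroupName": group_name}
    while True:
        page = sm_client.list_model_packages(**kwargs)
        for pkg in page["ModelPackageSummaryList"]:
            versions.append(
                {
                    "version": pkg.get("ModelPackageVersion"),
                    "arn": pkg["ModelPackageArn"],
                    "status": pkg.get("ModelApprovalStatus"),
                }
            )
        token = page.get("NextToken")
        if not token:
            return versions
        kwargs["NextToken"] = token  # fetch the next page of versions


def list_inventory(sm_client: Any) -> Dict[str, List[Dict[str, Any]]]:
    """Map each model group name to the summaries of its versions.

    Assumption: sm_client exposes the same methods and response shapes
    as boto3.client("sagemaker").
    """
    inventory: Dict[str, List[Dict[str, Any]]] = {}
    kwargs: Dict[str, Any] = {}
    while True:
        page = sm_client.list_model_package_groups(**kwargs)
        for group in page["ModelPackageGroupSummaryList"]:
            name = group["ModelPackageGroupName"]
            inventory[name] = _list_versions(sm_client, name)
        token = page.get("NextToken")
        if not token:
            return inventory
        kwargs = {"NextToken": token}  # fetch the next page of groups
```

With a real client, you would call list_inventory(boto3.client("sagemaker")) and print or export the resulting dictionary.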

In this post, we explore the new features of Model Registry that streamline FM management: you can now register unzipped model artifacts and pass an End User License Agreement (EULA) acceptance flag without needing users to intervene.

Overview

Model Registry has worked well for traditional models, which are smaller in size. For FMs, there were challenges because of their size and requirements for user intervention for EULA acceptance. With the new features in Model Registry, it’s become easier to register a fine-tuned FM within Model Registry, which then can be deployed for actual use.

A typical model development lifecycle is an iterative process. We conduct many experimentation cycles to achieve the expected performance from the model. Once trained, these models can be registered in the Model Registry, where they are cataloged as versions. The models can be organized in groups, the versions can be compared on their quality metrics, and each model can have an associated approval status indicating whether it is deployable.

Once the model is manually approved, a continuous integration and continuous deployment (CI/CD) pipeline can be triggered to deploy it to production. Optionally, Model Registry can be used as a repository of models that are approved for use by an enterprise. Various teams can then deploy these approved models from Model Registry and build applications around them.
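As a rough illustration of the approval step, the sketch below marks a version as approved and then looks up the newest approved version in a group, which is the kind of query a deployment pipeline might run. It assumes sm_client behaves like boto3.client("sagemaker"); the function names approve_version and latest_approved_arn are hypothetical.

```python
from typing import Any, Optional


def approve_version(sm_client: Any, model_package_arn: str) -> None:
    """Set the approval status of one registered version to Approved."""
    sm_client.update_model_package(
        ModelPackageArn=model_package_arn,
        ModelApprovalStatus="Approved",
    )


def latest_approved_arn(sm_client: Any, group_name: str) -> Optional[str]:
    """Return the ARN of the most recently created approved version, or None."""
    response = sm_client.list_model_packages(
        ModelPackageGroupName=group_name,
        ModelApprovalStatus="Approved",
        SortBy="CreationTime",
        SortOrder="Descending",
        MaxResults=1,
    )
    packages = response["ModelPackageSummaryList"]
    return packages[0]["ModelPackageArn"] if packages else None
```

How the status change triggers a pipeline (for example, through an event rule) is outside this sketch and depends on your own CI/CD setup.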

An example workflow could follow these steps and is shown in the following diagram:

    1. Select a SageMaker JumpStart model and register it in Model Registry.
    2. Alternatively, you can fine-tune a SageMaker JumpStart model.
    3. Evaluate the model with SageMaker model evaluation. SageMaker allows for human evaluation if desired.
    4. Create a model group in the Model Registry. For each run, create a model version. Add your model group into one or more Model Registry Collections, which can be used to group registered models that are related to each other. For example, you could have a collection of large language models (LLMs) and another collection of diffusion models.
    5. Deploy the models as SageMaker Inference endpoints that can be consumed by generative AI applications.

Figure 1: Model Registry workflow for foundation models
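The "create a model group, then a version per run" part of this workflow could be sketched as follows. This is an illustration under stated assumptions, not the only way to do it: sm_client is assumed to behave like boto3.client("sagemaker"), the helper register_run is hypothetical, storing metrics as customer metadata is our choice for the example, and only the first page of matching groups is checked. Collections are not covered here.

```python
from typing import Any, Dict


def register_run(
    sm_client: Any,
    group_name: str,
    source_uri: str,
    metrics: Dict[str, float],
) -> str:
    """Create the model group if needed, then register one version for a run.

    Returns the ARN of the new model package version.
    """
    # Simplification: look only at the first page of groups matching the name.
    existing = sm_client.list_model_package_groups(NameContains=group_name)
    names = {g["ModelPackageGroupName"] for g in existing["ModelPackageGroupSummaryList"]}
    if group_name not in names:
        sm_client.create_model_package_group(
            ModelPackageGroupName=group_name,
            ModelPackageGroupDescription="Fine-tuned foundation models",
        )

    # Metadata values must be strings, so metrics are converted with str().
    response = sm_client.create_model_package(
        ModelPackageGroupName=group_name,
        SourceUri=source_uri,
        ModelApprovalStatus="PendingManualApproval",
        CustomerMetadataProperties={k: str(v) for k, v in metrics.items()},
    )
    return response["ModelPackageArn"]
```

Each call adds a new version to the group, so repeated experiment runs accumulate as versions that can later be compared and approved.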

To better support generative AI applications, Model Registry released two new features: ModelDataSource, and source model URI. The following sections will explore these features and how to use them.

ModelDataSource speeds up deployment and provides access to EULA dependent models

Until now, when a model was registered in Model Registry, its artifacts had to be stored in a compressed format along with the inference code. This posed challenges for generative AI applications, where FMs are very large, with billions of parameters. Storing FMs as zipped models increased SageMaker endpoint startup latency, because decompressing these models at run time took a long time. The model_data_source parameter can now accept the location of the unzipped model artifacts in Amazon Simple Storage Service (Amazon S3), which simplifies registration. It also eliminates the need for endpoints to unzip the model weights, reducing latency at endpoint startup.

Additionally, public JumpStart models and certain FMs from independent service providers, such as Llama 2, require that their EULA be accepted before the models can be used. As a result, when public models from SageMaker JumpStart were fine-tuned, they could not be stored in Model Registry, because a user needed to accept the license agreement. Model Registry has added EULA acceptance flag support within the model_data_source parameter, which allows such models to be registered. Customers can now catalog and version a wider variety of FMs in Model Registry and associate metadata, such as training metrics, with them.

Register unzipped models stored in Amazon S3 using the SageMaker Python SDK.

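    from sagemaker.model import Model

    # Point at the uncompressed artifacts under an S3 prefix and accept the EULA.
    model_data_source = {
        "S3DataSource": {
            "S3Uri": "s3://bucket/model/prefix/",
            "S3DataType": "S3Prefix",
            "CompressionType": "None",
            "ModelAccessConfig": {"AcceptEula": True},
        }
    }

    # sagemaker_session and IMAGE_URI are assumed to be defined elsewhere.
    model = Model(
        sagemaker_session=sagemaker_session,
        image_uri=IMAGE_URI,
        model_data=model_data_source,
    )
    model.register()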
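If you register many models, a small helper can build this dictionary consistently. The following is a minimal sketch; build_model_data_source is a hypothetical name, and normalizing the prefix to end with a slash simply mirrors the example URI above rather than a documented requirement we are asserting.

```python
from typing import Any, Dict


def build_model_data_source(s3_prefix: str, accept_eula: bool = False) -> Dict[str, Any]:
    """Build a model_data_source dict for uncompressed artifacts under an S3 prefix.

    The EULA flag is only included when accept_eula is True, so that
    acceptance is always an explicit choice by the caller.
    """
    if not s3_prefix.startswith("s3://"):
        raise ValueError("expected an s3:// URI, got: " + s3_prefix)
    if not s3_prefix.endswith("/"):
        s3_prefix += "/"  # assumption: follow the trailing-slash style of the example

    s3_source: Dict[str, Any] = {
        "S3Uri": s3_prefix,
        "S3DataType": "S3Prefix",
        "CompressionType": "None",  # artifacts are stored unzipped
    }
    if accept_eula:
        s3_source["ModelAccessConfig"] = {"AcceptEula": True}
    return {"S3DataSource": s3_source}
```

The result can be passed as model_data when constructing the model, in the same way as the hand-written dictionary.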

Register models requiring a EULA.

    from sagemaker.jumpstart.model import JumpStartModel

    model_id = "meta-textgeneration-llama-2-7b"
    my_model = JumpStartModel(model_id=model_id)

    # Passing accept_eula=True records EULA acceptance at registration time.
    registered_model = my_model.register(accept_eula=True)
    predictor = registered_model.deploy()

Source model URI provides simplified registration and proprietary model support

Model Registry now supports automatic population of inference specification files for some recognized model IDs, including select AWS Marketplace models, hosted models, or versioned model packages in Model Registry. Because SourceModelURI supports automatic population, you can register proprietary JumpStart models from providers such as AI21 Labs, Cohere, and LightOn without needing the inference specification file, allowing your organization to use a broader set of FMs in Model Registry.

Previously, to register a trained model in the SageMaker Model Registry, you had to provide the complete inference specification required for deployment, including an Amazon Elastic Container Registry (Amazon ECR) image and the trained model file. With the launch of source_uri support, you can register any model by providing a source model URI, a free-form field that stores a model ID or location, such as a proprietary JumpStart or Bedrock model ID, an S3 location, or an MLflow model ID. Rather than supplying the details required for deploying to SageMaker hosting at registration time, you can add the artifacts later. After registration, to deploy a model, you can package the model with an inference specification and update Model Registry accordingly.

For example, you can register a model in Model Registry using a model Amazon Resource Name (ARN) as the source URI.

    model_arn = "<arn of the model to be registered>"

    # model is an existing SageMaker model object, created as shown earlier.
    registered_model_package = model.register(
        model_package_group_name="model_group_name",
        source_uri=model_arn,
    )

Later, you can update the registered model with the inference specification, making it deployable on SageMaker.

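    from sagemaker import get_execution_role
    from sagemaker.model import ModelPackage

    # Register a version that has only a source URI (not yet deployable).
    model_package = sagemaker_session.sagemaker_client.create_model_package(
        ModelPackageGroupName="model_group_name",
        SourceUri="source_uri",
    )

    # Wrap the registered package, then attach the container image needed for hosting.
    mp = ModelPackage(
        role=get_execution_role(sagemaker_session),
        model_package_arn=model_package["ModelPackageArn"],
        sagemaker_session=sagemaker_session,
    )
    mp.update_inference_specification(image_uris=["ecr_image_uri"])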
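Because a version registered with only a source URI cannot be deployed until it has an inference specification, it can be useful to check for one before attempting deployment. The sketch below is one way to do that, assuming sm_client behaves like boto3.client("sagemaker"). The helper name is_deployable is hypothetical, and treating "at least one container with an image" as deployable is our simplifying assumption.

```python
from typing import Any


def is_deployable(sm_client: Any, model_package_arn: str) -> bool:
    """Return True if the version has an inference specification with a container image."""
    description = sm_client.describe_model_package(ModelPackageName=model_package_arn)
    # Versions registered with only a source URI may have no InferenceSpecification.
    containers = description.get("InferenceSpecification", {}).get("Containers", [])
    return any(container.get("Image") for container in containers)
```

A deployment script could call this check first and, when it returns False, update the inference specification before deploying.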

Register an Amazon SageMaker JumpStart proprietary FM.

    from sagemaker.jumpstart.model import JumpStartModel

    model_id = "ai21-contextual-answers"
    my_model = JumpStartModel(model_id=model_id)
    model_package = my_model.register()

Conclusion

As organizations continue to adopt generative AI in different parts of their business, having robust model management and versioning becomes paramount. With Model Registry, you can achieve version control, tracking, collaboration, lifecycle management, and governance of FMs.

In this post, we explored how Model Registry can now more effectively support managing generative AI models across the model lifecycle, empowering you to better govern and adopt generative AI to achieve transformational outcomes.

To learn more about Model Registry, see Register and Deploy Models with Model Registry. To get started, visit the SageMaker console.


About the Authors

Chaitra Mathur serves as a Principal Solutions Architect at AWS, where her role involves advising clients on building robust, scalable, and secure solutions on AWS. With a keen interest in data and ML, she assists clients in leveraging AWS AI/ML and generative AI services to address their ML requirements effectively. Throughout her career, she has shared her expertise at numerous conferences and has authored several blog posts in the ML area.

Kait Healy is a Solutions Architect II at AWS. She specializes in working with startups and enterprise automotive customers, where she has experience building AI/ML solutions at scale to drive key business outcomes.

Saumitra Vikaram is a Senior Software Engineer at AWS. He is focused on AI/ML technology, ML model management, ML governance, and MLOps to improve overall organizational efficiency and productivity.

Siamak Nariman is a Senior Product Manager at AWS. He is focused on AI/ML technology, ML model management, and ML governance to improve overall organizational efficiency and productivity. He has extensive experience automating processes and deploying various technologies.
