AWS Machine Learning Blog, November 14, 2024
Improve governance of models with Amazon SageMaker unified Model Cards and Model Registry

 

Amazon SageMaker has introduced an integration between Model Registry and Model Cards that streamlines machine learning model governance. Users can now manage governance information for model versions, including intended use, performance, risk, and business details, directly in the SageMaker Model Registry. This capability is especially relevant for high-stakes or regulated industries such as financial services and healthcare: with detailed model cards, organizations can establish responsible ML development processes and help governance teams make better-informed decisions. The integration also resolves earlier challenges in model registration and governance, such as the lack of a unified user experience and the difficulty of sharing model information, simplifying model lifecycle management and deployment.

🤔**Model cards integrated with the model registry:** SageMaker Model Registry now supports associating model cards with model versions, making it easy to manage a model's governance information, such as intended use, performance, risk, and business details.

🚀**Simplified model governance:** By unifying model cards and the model registry, users can view and manage models from a single place in SageMaker Model Registry, streamlining lifecycle management and improving the efficiency and transparency of model governance.

🔄**Unified model governance architecture:** The solution provides a unified governance architecture that covers every stage of the ML lifecycle, including model development, training, evaluation, deployment, and monitoring, helping organizations build an end-to-end governance process.

📊**Collaboration through Amazon DataZone:** SageMaker integrates with Amazon DataZone, allowing ML builders to collaborate with data engineers on ML use cases and to share model information through the Amazon DataZone catalog.

🛡️**Secure and scalable model governance:** The solution offers best practices, such as defining ML use case metadata, setting up use case approval workflows, and creating ML projects and model package groups, to help organizations establish a secure, scalable, end-to-end model governance process.

You can now register machine learning (ML) models in Amazon SageMaker Model Registry with Amazon SageMaker Model Cards, making it straightforward to manage governance information for specific model versions directly in SageMaker Model Registry in just a few clicks.

Model cards are an essential component for registered ML models, providing a standardized way to document and communicate key model metadata, including intended use, performance, risks, and business information. This transparency is particularly important for registered models, which are often deployed in high-stakes or regulated industries, such as financial services and healthcare. By including detailed model cards, organizations can establish the responsible development of their ML systems, enabling better-informed decisions by the governance team.

When solving a business problem with an ML model, customers refine their approach and register multiple versions of the model in SageMaker Model Registry to find the best candidate model. To effectively operationalize and govern these model versions, customers want the ability to clearly associate model cards with a particular model version. Until now, there was no unified user experience for doing so, which posed challenges for customers, who needed a more streamlined way to register and govern their models.

Because SageMaker Model Cards and SageMaker Model Registry were built on separate APIs, it was challenging to associate the model information and gain a comprehensive view of the model development lifecycle. Integrating model information and then sharing it across different stages became increasingly difficult. This required custom integration efforts, along with complex AWS Identity and Access Management (IAM) policy management, further complicating the model governance process.

With the unification of SageMaker Model Cards and SageMaker Model Registry, architects, data scientists, ML engineers, or platform engineers (depending on the organization’s hierarchy) can now seamlessly register ML model versions early in the development lifecycle, including essential business details and technical metadata. This unification allows you to review and govern models across your lifecycle from a single place in SageMaker Model Registry. By consolidating model governance workflows in SageMaker Model Registry, you can improve transparency and streamline the deployment of models to production environments upon governance officers’ approval.

In this post, we discuss a new feature that supports the integration of model cards with the model registry. We discuss the solution architecture and best practices for managing model cards with a registered model version, and walk through how to set up, operationalize, and govern your models using the integration in the model registry.

Solution overview

In this section, we discuss the solution to address the aforementioned challenges with model governance. First, we introduce the unified model governance solution architecture for addressing the model governance challenges for an end-to-end ML lifecycle in a scalable, well-architected environment. Then we dive deep into the details of the unified model registry and discuss how it helps with governance and deployment workflows.

Unified model governance architecture

ML governance enforces the ethical, legal, and efficient use of ML systems by addressing concerns like bias, transparency, explainability, and accountability. It helps organizations comply with regulations, manage risks, and maintain operational efficiency through robust model lifecycles and data quality management. Ultimately, ML governance builds stakeholder trust and aligns ML initiatives with strategic business goals, maximizing their value and impact. ML governance starts when you want to solve a business use case or problem with ML and is part of every step of your ML lifecycle, from use case inception, model building, training, evaluation, deployment, and monitoring of your production ML system.

Let’s delve into the architecture details of how you can use a unified model registry along with other AWS services to govern your ML use case and models throughout the entire ML lifecycle.

SageMaker Model Registry catalogs your models along with their versions and associated metadata and metrics for training and evaluation. It also maintains audit and inference metadata to help drive governance and deployment workflows.
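Governance tooling built on the registry often needs to answer questions like "which versions of this model are approved for deployment?". The following sketch filters version summaries by approval status; the record shapes are illustrative, modeled on the subset of fields returned by the SageMaker `ListModelPackages` API, and the ARNs are placeholders:

```python
def approved_versions(model_package_summaries):
    """Return (version, ARN) pairs for approved model package versions,
    newest version first. Each summary mirrors an illustrative subset of
    the fields returned by the SageMaker ListModelPackages API."""
    approved = [
        s for s in model_package_summaries
        if s["ModelApprovalStatus"] == "Approved"
    ]
    approved.sort(key=lambda s: s["ModelPackageVersion"], reverse=True)
    return [(s["ModelPackageVersion"], s["ModelPackageArn"]) for s in approved]

# Hypothetical summaries for an "abalone" model package group
summaries = [
    {"ModelPackageVersion": 1, "ModelPackageArn": "arn:aws:sagemaker:us-east-1:111122223333:model-package/abalone/1",
     "ModelApprovalStatus": "Rejected"},
    {"ModelPackageVersion": 2, "ModelPackageArn": "arn:aws:sagemaker:us-east-1:111122223333:model-package/abalone/2",
     "ModelApprovalStatus": "Approved"},
    {"ModelPackageVersion": 3, "ModelPackageArn": "arn:aws:sagemaker:us-east-1:111122223333:model-package/abalone/3",
     "ModelApprovalStatus": "PendingManualApproval"},
]
```

In a real deployment workflow, the summaries would come from a paginated `list_model_packages` call with a boto3 SageMaker client rather than an in-memory list.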

The following are key concepts used in the model registry:

Additionally, this solution uses Amazon DataZone. The integration of SageMaker and Amazon DataZone enables collaboration between ML builders and data engineers for building ML use cases. ML builders can request access to data published by data engineers. Upon receiving approval, ML builders can then consume the accessed data to engineer features, create models, and publish features and models to the Amazon DataZone catalog for sharing across the enterprise. As part of the SageMaker Model Cards and SageMaker Model Registry unification, ML builders can now share technical and business information about their models, including training and evaluation details, as well as business metadata such as model risk, for ML use cases.

The following diagram depicts the architecture for unified governance across your ML lifecycle.

There are several steps for implementing secure and scalable end-to-end governance for your ML lifecycle:

    1. Define your ML use case metadata (name, description, risk, and so on) for the business problem you’re trying to solve (for example, automate a loan application process).
    2. Set up and invoke your use case approval workflow for building the ML model (for example, fraud detection) for the use case.
    3. Create an ML project to create a model for the ML use case.
    4. Create a SageMaker model package group to start building the model. Associate the model to the ML project and record qualitative information about the model, such as purpose, assumptions, and owner.
    5. Prepare the data to build your model training pipeline.
    6. Evaluate your training data for data quality, including feature importance and bias, and update the model package version with relevant evaluation metrics.
    7. Train your ML model with the prepared data and register the candidate model package version with training metrics.
    8. Evaluate your trained model for model bias and model drift, and update the model package version with relevant evaluation metrics.
    9. Validate that the candidate model experimentation results meet your model governance criteria based on your use case risk profile and compliance requirements.
    10. After you receive the governance team’s approval on the candidate model, record the approval on the model package version and invoke an automated test deployment pipeline to deploy the model to a test environment.
    11. Run model validation tests in the test environment and make sure the model integrates and works with upstream and downstream dependencies similar to a production environment.
    12. After you validate the model in the test environment and make sure the model complies with use case requirements, approve the model for production deployment.
    13. After you deploy the model to the production environment, continuously monitor model performance metrics (such as quality and bias) to make sure the model stays in compliance and meets your business use case key performance indicators (KPIs).
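As an illustration of the validation gate described in the steps above, a minimal automated pre-check might compare candidate metrics against per-risk thresholds before routing the model to the governance team. The metric names and threshold values below are hypothetical; real criteria come from your governance and compliance requirements:

```python
def governance_gate(metrics, risk_rating):
    """Decide whether a candidate model version may proceed to the governance
    team's manual review, using hypothetical per-risk thresholds. 'bias_dpl'
    stands in for a bias metric such as difference in positive proportions."""
    thresholds = {  # stricter limits for higher-risk use cases
        "Low":    {"min_accuracy": 0.80, "max_bias_dpl": 0.20},
        "Medium": {"min_accuracy": 0.85, "max_bias_dpl": 0.10},
        "High":   {"min_accuracy": 0.90, "max_bias_dpl": 0.05},
    }[risk_rating]
    ok = (metrics["accuracy"] >= thresholds["min_accuracy"]
          and metrics["bias_dpl"] <= thresholds["max_bias_dpl"])
    return "PendingManualApproval" if ok else "Rejected"
```

The returned string matches the approval statuses a SageMaker model package version can carry, so the result of the gate can be recorded directly on the model package.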

Architecture tools, components, and environments

You need to set up several components and environments for orchestrating the solution workflow:

Integrate a model version in the model registry with model cards

In this section, we provide API implementation details for testing this in your own environment. We walk through an example notebook to demonstrate how you can use this unification during the model development data science lifecycle.

We have two example notebooks in the GitHub repository: AbaloneExample and DirectMarketing.

Complete the following steps in the Abalone example notebook:

    Install or update the necessary packages and libraries.
    Import the necessary libraries and instantiate the necessary variables, such as the SageMaker client and Amazon Simple Storage Service (Amazon S3) buckets.
    Create an Amazon DataZone domain and a project within the domain.

You can use an existing project if you already have one. This step is optional; we reference the Amazon DataZone project ID while creating the SageMaker model package. For overall governance across your data and model lifecycle, this helps create the correlation between the business unit/domain, the data, and the corresponding model.

The following screenshot shows the Amazon DataZone welcome page for a test domain.

In Amazon DataZone, projects enable a group of users to collaborate on business use cases: they create assets in project inventories, making them discoverable by all project members, and then publish, discover, subscribe to, and consume assets in the Amazon DataZone catalog. Project members consume assets from the catalog and produce new assets using one or more analytical workflows. Project members can be owners or contributors.

You can gather the project ID on the project details page, as shown in the following screenshot.

In the notebook, we refer to the project ID as follows:

project_id = "5rn1teh0tv85rb"

    Prepare a SageMaker model package group.

A model group contains a group of versioned models. We refer to the Amazon DataZone project ID when we create the model package group, as shown in the following screenshot. It’s mapped to the custom_details field.
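For reference, the request to create the model package group with the Amazon DataZone project ID attached might be built as follows. This is a sketch assuming the boto3 `create_model_package_group` API; the tag key used to carry the project ID is hypothetical:

```python
def build_model_package_group_request(group_name, datazone_project_id):
    """Build a CreateModelPackageGroup request body, carrying the Amazon
    DataZone project ID as a tag (hypothetical key) so the group can be
    traced back to its business domain."""
    return {
        "ModelPackageGroupName": group_name,
        "ModelPackageGroupDescription": f"Model group for DataZone project {datazone_project_id}",
        "Tags": [{"Key": "DataZoneProjectId", "Value": datazone_project_id}],
    }

request = build_model_package_group_request("abalone-models", "5rn1teh0tv85rb")
# A boto3 SageMaker client would then send it as:
#   sagemaker_client.create_model_package_group(**request)
```

Building the request as plain data keeps the governance metadata easy to inspect and unit test before any AWS call is made.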

    Update the details for the model card, including the intended use and owner:
model_overview = ModelOverview(
    #model_description="This is an example model used for a Python SDK demo of unified Amazon SageMaker Model Registry and Model Cards.",
    #problem_type="Binary Classification",
    #algorithm_type="Logistic Regression",
    model_creator="DEMO-Model-Registry-ModelCard-Unification",
    #model_owner="datascienceteam",
)
intended_uses = IntendedUses(
    purpose_of_model="Test model card.",
    intended_uses="Not used except this test.",
    factors_affecting_model_efficiency="No.",
    risk_rating=RiskRatingEnum.LOW,
    explanations_for_risk_rating="Just an example.",
)
business_details = BusinessDetails(
    business_problem="The business problem that your model is used to solve.",
    business_stakeholders="The stakeholders who have the interest in the business that your model is used for.",
    line_of_business="Services that the business is offering.",
)
additional_information = AdditionalInformation(
    ethical_considerations="Your model ethical consideration.",
    caveats_and_recommendations="Your model's caveats and recommendations.",
    custom_details={"custom details1": "details value"},
)
my_card = ModelCard(
    name="mr-mc-unification",
    status=ModelCardStatusEnum.DRAFT,
    model_overview=model_overview,
    intended_uses=intended_uses,
    business_details=business_details,
    additional_information=additional_information,
    sagemaker_session=sagemaker_session,
)

This data is used to update the created model package. The SageMaker model package helps create a deployable model that you can use to get real-time inferences by creating a hosted endpoint or to run batch transform jobs.

The model card information shown as model_card=my_card in the following code snippet can be passed to the pipeline during the model register step:

register_args = model.register(
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.t2.medium", "ml.m5.large"],
    transform_instances=["ml.m5.large"],
    model_package_group_name=model_package_group_name,
    approval_status=model_approval_status,
    model_metrics=model_metrics,
    drift_check_baselines=drift_check_baselines,
    model_card=my_card,
)
step_register = ModelStep(name="RegisterAbaloneModel", step_args=register_args)

Alternatively, you can pass it as follows:

step_register = RegisterModel(
    name="MarketingRegisterModel",
    estimator=xgb_train,
    model_data=step_train.properties.ModelArtifacts.S3ModelArtifacts,
    content_types=["text/csv"],
    response_types=["text/csv"],
    inference_instances=["ml.t2.medium", "ml.m5.xlarge"],
    transform_instances=["ml.m5.xlarge"],
    model_package_group_name=model_package_group_name,
    approval_status=model_approval_status,
    model_metrics=model_metrics,
    model_card=my_card,
)

The notebook then invokes a run of the SageMaker pipeline (a run can also be started from an event or from the Pipelines UI), which includes preprocessing, training, and evaluation steps.

After the pipeline is complete, you can navigate to Amazon SageMaker Studio, where you can see a model package on the Models page.

You can view the details like business details, intended use, and more on the Overview tab under Audit, as shown in the following screenshots.

The Amazon DataZone project ID is captured in the Documentation section.

You can view performance metrics under Train as well.

Evaluation details like model quality, bias pre-training, bias post-training, and explainability can be reviewed on the Evaluate tab.

Optionally, you can view the model card details from the model package itself.

Additionally, you can update the audit details of the model by choosing Edit in the top right corner. Once you are done with your changes, choose Save to keep the changes in the model card.

Also, you can update the model’s deploy status.
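Recording an approval decision programmatically follows the same pattern. The following sketch assumes the boto3 `update_model_package` API, which accepts a model package ARN, an approval status, and an approval description; the ARN below is illustrative:

```python
def build_approval_update(model_package_arn, status, note):
    """Build an UpdateModelPackage request that records a governance decision
    on a specific model package version."""
    allowed = {"Approved", "Rejected", "PendingManualApproval"}
    if status not in allowed:
        raise ValueError(f"status must be one of {sorted(allowed)}")
    return {
        "ModelPackageArn": model_package_arn,
        "ModelApprovalStatus": status,
        "ApprovalDescription": note,
    }

update = build_approval_update(
    "arn:aws:sagemaker:us-east-1:111122223333:model-package/abalone/2",  # illustrative ARN
    "Approved",
    "Passed governance review for the loan-application use case.",
)
# A boto3 SageMaker client would then apply it as:
#   sagemaker_client.update_model_package(**update)
```

Recording the decision on the model package version is what lets downstream deployment pipelines key off the approval status.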

You can track the different statuses and activity as well.

Lineage

ML lineage is crucial for tracking the origin, evolution, and dependencies of data, models, and code used in ML workflows, providing transparency and traceability. It helps with reproducibility and debugging, making it straightforward to understand and address issues.

Model lineage tracking captures and retains information about the stages of an ML workflow, from data preparation and training to model registration and deployment. You can view the lineage details of a registered model version in SageMaker Model Registry using SageMaker ML lineage tracking, as shown in the following screenshot. ML model lineage tracks the metadata associated with your model training and deployment workflows, including training jobs, datasets used, pipelines, endpoints, and the actual models. You can also use the graph node to view more details, such as dataset and images used in that step.
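Conceptually, a lineage graph is a set of directed associations between entities (datasets, training jobs, model versions, endpoints). The following sketch walks such a graph upstream over a hypothetical in-memory edge list, which is the same question a lineage query against the registry answers: what does this deployed artifact ultimately depend on?

```python
from collections import deque

def upstream_entities(edges, start):
    """Breadth-first walk over 'produced-from' edges to find everything a
    given entity (for example, a deployed endpoint) ultimately depends on."""
    graph = {}
    for child, parent in edges:  # child was produced from parent
        graph.setdefault(child, []).append(parent)
    seen, queue, order = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for parent in graph.get(node, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order

# Hypothetical lineage: endpoint <- model version <- training job <- dataset
edges = [
    ("endpoint", "model-v2"),
    ("model-v2", "training-job"),
    ("training-job", "abalone-dataset"),
]
```

Against a real registry, the edges would come from the SageMaker lineage tracking APIs rather than a hand-built list, but the traversal logic is the same.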

Clean up

If you created resources while using the notebook in this post, follow the instructions in the notebook to clean up those resources.

Conclusion

In this post, we discussed a solution to use a unified model registry with other AWS services to govern your ML use case and models throughout the entire ML lifecycle in your organization. We walked through an end-to-end architecture for developing an AI use case embedding governance controls, from use case inception to model building, model validation, and model deployment in production. We demonstrated through code how to register a model and update it with governance, technical, and business metadata in SageMaker Model Registry.

We encourage you to try out this solution and share your feedback in the comments section.


About the authors

Ram Vittal is a Principal ML Solutions Architect at AWS. He has over 3 decades of experience architecting and building distributed, hybrid, and cloud applications. He is passionate about building secure and scalable AI/ML and big data solutions to help enterprise customers with their cloud adoption and optimization journey to improve their business outcomes. In his spare time, he rides his motorcycle and walks with his 3-year-old Sheepadoodle.

Neelam Koshiya is a Principal Solutions Architect (GenAI specialist) at AWS. With a background in software engineering, she moved organically into an architecture role. Her current focus is to help enterprise customers with their ML/GenAI journeys for strategic business outcomes. Her area of depth is machine learning. In her spare time, she enjoys reading and being outdoors.

Siamak Nariman is a Senior Product Manager at AWS. He is focused on AI/ML technology, ML model management, and ML governance to improve overall organizational efficiency and productivity. He has extensive experience automating processes and deploying various technologies.

Saumitra Vikaram is a Senior Software Engineer at AWS. He is focused on AI/ML technology, ML model management, ML governance, and MLOps to improve overall organizational efficiency and productivity.
