Unknown source, October 2, 2024
Provision and manage ML environments with Amazon SageMaker Canvas using AWS CDK and AWS Service Catalog

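This article examines the broad adoption of machine learning across industries and how non-ML practitioners can use it through no-code ML platforms. Using Amazon SageMaker Canvas as an example, it describes how business analysts can solve business problems without writing code. It also discusses how IT teams can securely provision and deploy ML environments with tools such as the AWS CDK and AWS Service Catalog, and walks through the implementation steps, prerequisites, solution, and an example flow.

🎯 Machine learning is widely used, but the number of ML practitioners is growing more slowly. No-code ML platforms let non-ML practitioners apply ML to their data. Amazon SageMaker Canvas lets business analysts solve business problems without writing code, and its simple, intuitive interface helps businesses implement solutions quickly.

🛠️ IT teams can use Amazon SageMaker Canvas, AWS Cloud Development Kit (AWS CDK), and AWS Service Catalog to administer, provision, and manage secure ML environments. The article provides a detailed step-by-step guide.

📋 Provisioning ML environments with Canvas takes three steps: manage the portfolio of resources Canvas requires with AWS Service Catalog; deploy an example portfolio with the AWS CDK; and provision Canvas environments on demand within minutes.

🌐 In regulated industries and large enterprises, IT administrators can use AWS Service Catalog to create and organize secure, reproducible ML environments, meet requirements through IaC controls, and control who can access the portfolio to launch products.

📂 In the example flow, the AWS Service Catalog portfolio comprises a Studio domain, an Amazon S3 bucket, Canvas users, and scheduled shutdown of Canvas sessions. The article describes the function and configuration of each part.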

<section class="blog-post-content"><p>Machine learning (ML) is spreading across a wide range of use cases in every industry. However, this growth outpaces the increase in the number of ML practitioners who have traditionally been responsible for implementing these technical solutions to realize business outcomes.</p>
<p>In today’s enterprise, there is a need for machine learning to be used by non-ML practitioners who are proficient with data, which is the foundation of ML. To make this a reality, the value of ML is being realized across the enterprise through no-code ML platforms. These platforms enable different personas, for example business analysts, to use ML without writing a single line of code and deliver solutions to business problems in a quick, simple, and intuitive manner. <a href="https://aws.amazon.com/sagemaker/canvas/" target="_blank" rel="noopener noreferrer">Amazon SageMaker Canvas</a> is a visual point-and-click service that enables business analysts to use ML to solve business problems by generating accurate predictions on their own, without requiring any ML experience or having to write a single line of code. Canvas has expanded the use of ML in the enterprise with a simple-to-use intuitive interface that helps businesses implement solutions quickly.</p>
<p>Although Canvas has enabled democratization of ML, the challenge of provisioning and deploying ML environments in a secure manner still remains. Typically, this is the responsibility of central IT teams in most large enterprises. 
In this post, we discuss how IT teams can administer, provision, and manage secure ML environments using <a href="https://aws.amazon.com/sagemaker/canvas/" target="_blank" rel="noopener noreferrer">Amazon SageMaker Canvas</a>, <a href="https://aws.amazon.com/cdk/" target="_blank" rel="noopener noreferrer">AWS Cloud Development Kit</a> (AWS CDK), and <a href="https://aws.amazon.com/servicecatalog/" target="_blank" rel="noopener noreferrer">AWS Service Catalog</a>. The post presents a step-by-step guide for IT administrators to achieve this quickly and at scale.</p>
<h2>Overview of the AWS CDK and AWS Service Catalog</h2>
<p>The AWS CDK is an open-source software development framework to define your cloud application resources. It uses the familiarity and expressive power of programming languages for modeling your applications, while provisioning resources in a safe and repeatable manner.</p>
<p>AWS Service Catalog lets you centrally manage deployed IT services, applications, resources, and metadata. With AWS Service Catalog, you can create, share, organize, and govern cloud resources with infrastructure as code (IaC) templates and enable fast and straightforward provisioning.</p>
<h2>Solution overview</h2>
<p>We enable provisioning of ML environments using Canvas in three steps:</p>
<ol><li>First, we share how you can manage a portfolio of resources necessary for the approved usage of Canvas using AWS Service Catalog.</li><li>Then, we deploy an example AWS Service Catalog portfolio for Canvas using the AWS CDK.</li><li>Finally, we demonstrate how you can provision Canvas environments on demand within minutes.</li></ol>
<h2>Prerequisites</h2>
<p>To provision ML environments with Canvas, the AWS CDK, and AWS Service Catalog, you need to do the following:</p>
<ol><li>Have access to the AWS account where the Service Catalog portfolio will be deployed. Make sure you have the credentials and permissions to deploy the AWS CDK stack into your account. 
The <a href="https://cdkworkshop.com/" target="_blank" rel="noopener noreferrer">AWS CDK Workshop</a> is a helpful resource you can refer to if you need support.</li><li>We recommend following certain best practices that are highlighted through the concepts detailed in the following resources:</li><li>Clone <a href="https://github.com/aws-samples/amazon-sagemaker-canvas-service-catalog" target="_blank" rel="noopener noreferrer">this GitHub repository</a> into your environment.</li></ol>
<h2>Provision approved ML environments with Amazon SageMaker Canvas using AWS Service Catalog</h2>
<p>In regulated industries and most large enterprises, you need to adhere to the requirements mandated by IT teams to provision and manage ML environments. These may include a secure, private network, data encryption, controls such as <a href="http://aws.amazon.com/iam" target="_blank" rel="noopener noreferrer">AWS Identity and Access Management</a> (IAM) that allow only authorized and authenticated users to access solutions such as Canvas, and strict logging and monitoring for audit purposes.</p>
<p>As an IT administrator, you can use AWS Service Catalog to create and organize secure, reproducible ML environments with SageMaker Canvas into a product portfolio. The portfolio is managed using embedded IaC controls that meet the requirements listed earlier, and its products can be provisioned on demand within minutes. 
You can also maintain control of who can access this portfolio to launch products.</p>
<p>The following diagram illustrates this architecture.</p>
<p><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image001.jpg"><img class="alignnone size-full wp-image-42829" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image001.jpg" alt="" width="1285" height="852" /></a></p>
<h2>Example flow</h2>
<p>In this section, we demonstrate an example of an AWS Service Catalog portfolio with SageMaker Canvas. The portfolio consists of the following components of the Canvas environment:</p>
<ul><li><strong>Studio domain</strong> – Canvas is an application that runs within <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/studio-entity-status.html" target="_blank" rel="noopener noreferrer">Studio domains</a>. The domain consists of an <a href="https://aws.amazon.com/efs/" target="_blank" rel="noopener noreferrer">Amazon Elastic File System</a> (Amazon EFS) volume, a list of authorized users, and a range of security, application, policy, and <a href="http://aws.amazon.com/vpc" target="_blank" rel="noopener noreferrer">Amazon Virtual Private Cloud</a> (VPC) configurations. An AWS account is linked to one domain per Region.</li><li><strong>Amazon S3 bucket</strong> – After the Studio domain is created, an <a href="http://aws.amazon.com/s3" target="_blank" rel="noopener noreferrer">Amazon Simple Storage Service</a> (Amazon S3) bucket is provisioned for Canvas to allow importing datasets from local files, also known as local file upload. 
This bucket is in the customer’s account and is provisioned once.</li><li><strong>Canvas user</strong> – SageMaker Canvas is an application where you can add user profiles within the Studio domain for each Canvas user, who can proceed to import datasets, build and train ML models without writing code, and run predictions on the model.</li><li><strong>Scheduled shutdown of Canvas sessions</strong> – Canvas users can log out from the Canvas interface when they’re done with their tasks. Alternatively, <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/canvas-manage-apps.html" target="_blank" rel="noopener noreferrer">administrators can shut down Canvas sessions</a> from the <a href="http://aws.amazon.com/console" target="_blank" rel="noopener noreferrer">AWS Management Console</a> as part of managing the Canvas sessions. In this part of the AWS Service Catalog portfolio, an <a href="http://aws.amazon.com/lambda" target="_blank" rel="noopener noreferrer">AWS Lambda</a> <a href="https://github.com/aws-samples/amazon-sagemaker-canvas-service-catalog/tree/main/lambda_images/shutdown" target="_blank" rel="noopener noreferrer">function</a> is created and provisioned to automatically shut down Canvas sessions at defined scheduled intervals. This helps manage open sessions and shut them down when not in use.</li></ul>
<p>This example flow can be found in the <a href="https://github.com/aws-samples/amazon-sagemaker-canvas-service-catalog" target="_blank" rel="noopener noreferrer">GitHub repository</a> for quick reference.</p>
<h2>Deploy the flow with the AWS CDK</h2>
<p>In this section, we deploy the flow described earlier using the AWS CDK. After it’s deployed, you can also do version tracking and manage the portfolio.</p>
<p>The portfolio stack can be found in <code>app.py</code> and the product stacks under the <code>products/</code> folder. 
You can iterate on the IAM roles, <a href="http://aws.amazon.com/kms" target="_blank" rel="noopener noreferrer">AWS Key Management Service</a> (AWS KMS) keys, and VPC setup in the <code>studio_constructs/</code> folder. Before deploying the stack into your account, you can edit the following lines in <code>app.py</code> and grant portfolio access to an IAM role of your choice.</p>
<p><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image003.jpg"><img class="alignnone size-full wp-image-42830" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image003.jpg" alt="" width="1131" height="327" /></a></p>
<p>You can manage access to the portfolio for the relevant IAM users, groups, and roles. See <a href="https://docs.aws.amazon.com/servicecatalog/latest/adminguide/catalogs_portfolios_users.html" target="_blank" rel="noopener noreferrer">Granting Access to Users</a> for more details.</p>
<h2>Deploy the portfolio into your account</h2>
<p>You can now run the following commands to install the AWS CDK and make sure you have the right dependencies to deploy the portfolio:</p>
<p>Run the following commands to deploy the portfolio into your account:</p>
<p>The first two commands get your account ID and current Region using the <a href="http://aws.amazon.com/cli" target="_blank" rel="noopener noreferrer">AWS Command Line Interface</a> (AWS CLI) on your computer. 
Following this, <code>cdk bootstrap</code> and <code>cdk deploy</code> build assets locally, and deploy the stack in a few minutes.</p>
<p><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image005.jpg"><img class="alignnone size-full wp-image-42831 c4" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image005.jpg" alt="" width="1281" height="437" /></a></p>
<p>The portfolio can now be found in AWS Service Catalog, as shown in the following screenshot.</p>
<p><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image007.jpg"><img class="alignnone size-full wp-image-42832 c4" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image007.jpg" alt="" width="1287" height="492" /></a></p>
<h2>On-demand provisioning</h2>
<p>The products within the portfolio can be launched quickly and easily on demand from the <strong>Provisioning</strong> menu on the AWS Service Catalog console. A typical flow is to launch the Studio domain and the Canvas auto shutdown first because this is usually a one-time action. You can then add Canvas users to the domain. The domain ID and user IAM role ARN are saved in <a href="https://aws.amazon.com/systems-manager/" target="_blank" rel="noopener noreferrer">AWS Systems Manager</a> and are automatically populated with the user parameters as shown in the following screenshot.</p>
<p><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image009-1.jpg"><img class="alignnone wp-image-42839 size-full c4" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/ML-9783-image009-1.jpg" alt="" width="500" height="351" /></a></p>
<p>You can also use cost allocation tags that are attached to each user. 
For example, <code>UserCostCenter</code> is a sample tag where you can add the name of each user.</p>
<h2>Key considerations for governing ML environments using Canvas</h2>
<p>Now that we have provisioned and deployed an AWS Service Catalog portfolio focused on Canvas, we’d like to highlight a few considerations for governing Canvas-based ML environments, focused on the domain and the user profile.</p>
<p>The following are considerations regarding the Studio domain:</p>
<ul><li>Networking for Canvas is managed at the Studio domain level, where the domain is deployed on a private VPC subnet for secure connectivity. See <a href="https://aws.amazon.com/blogs/machine-learning/securing-amazon-sagemaker-studio-connectivity-using-a-private-vpc/" target="_blank" rel="noopener noreferrer">Securing Amazon SageMaker Studio connectivity using a private VPC</a> to learn more.</li><li>A default IAM execution role is defined at the domain level. This default role is assigned to all Canvas users in the domain.</li><li>Encryption is done using AWS KMS by encrypting the EFS volume in the domain. For additional controls, you can specify your own managed key, also known as a customer managed key (CMK). See <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/encryption-at-rest.html" target="_blank" rel="noopener noreferrer">Protect Data at Rest Using Encryption</a> to learn more.</li><li>Uploading files from your local disk is enabled by attaching a cross-origin resource sharing (CORS) policy to the S3 bucket used by Canvas. See <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/canvas-set-up-local-upload.html" target="_blank" rel="noopener noreferrer">Give Your Users Permissions to Upload Local Files</a> to learn more.</li></ul>
<p>The following are considerations regarding the user profile:</p>
<ul><li>Authentication in Studio can be done both through single sign-on (SSO) and IAM. 
If you have an existing identity provider to federate users to access the console, you can assign a Studio user profile to each federated identity using IAM. See the section <strong>Assigning the policy to Studio users</strong> in <a href="https://aws.amazon.com/blogs/machine-learning/configuring-amazon-sagemaker-studio-for-teams-and-groups-with-complete-resource-isolation/" target="_blank" rel="noopener noreferrer">Configuring Amazon SageMaker Studio for teams and groups with complete resource isolation</a> to learn more.</li><li>You can assign IAM execution roles to each user profile. While using Studio, a user assumes the role mapped to their user profile, which overrides the default execution role. You can use this for fine-grained access controls within a team.</li><li>You can achieve isolation using attribute-based access controls (ABAC) to ensure users can only access the resources for their team. See <a href="https://aws.amazon.com/blogs/machine-learning/configuring-amazon-sagemaker-studio-for-teams-and-groups-with-complete-resource-isolation/" target="_blank" rel="noopener noreferrer">Configuring Amazon SageMaker Studio for teams and groups with complete resource isolation</a> to learn more.</li><li>You can perform fine-grained cost tracking by applying cost allocation tags to user profiles.</li></ul>
<h2>Clean up</h2>
<p>To clean up the resources created by the AWS CDK stack, navigate to the AWS CloudFormation stacks page and delete the Canvas stacks. You can also run <code>cdk destroy</code> from within the repository folder to do the same.</p>
<h2>Conclusion</h2>
<p>In this post, we shared how you can quickly and easily provision ML environments with Canvas using AWS Service Catalog and the AWS CDK. We discussed how you can create a portfolio on AWS Service Catalog, provision the portfolio, and deploy it in your account. 
IT administrators can use this method to deploy and manage users, sessions, and associated costs while provisioning Canvas.</p>
<p>Learn more about Canvas on the <a href="https://aws.amazon.com/sagemaker/canvas/" target="_blank" rel="noopener noreferrer">product page</a> and the <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/canvas.html" target="_blank" rel="noopener noreferrer">Developer Guide</a>. For further reading, you can learn how to <a href="https://aws.amazon.com/blogs/machine-learning/enable-business-analysts-to-access-amazon-sagemaker-canvas-without-using-the-aws-management-console-with-aws-sso/" target="_blank" rel="noopener noreferrer">enable business analysts to access SageMaker Canvas using AWS SSO without the console</a>. You can also learn how <a href="https://aws.amazon.com/blogs/machine-learning/build-share-deploy-how-business-analysts-and-data-scientists-achieve-faster-time-to-market-using-no-code-ml-and-amazon-sagemaker-canvas/" target="_blank" rel="noopener noreferrer">business analysts and data scientists can collaborate faster using Canvas and Studio</a>.</p>
<h3><strong>About the Authors</strong></h3>
<p class="c5"><strong><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2021/12/20/Davide-Gallitelli.png"><img class="size-full wp-image-31919 alignleft" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2021/12/20/Davide-Gallitelli.png" alt="" width="100" height="150" /></a>Davide Gallitelli</strong> is a Specialist Solutions Architect for AI/ML in the EMEA region. He is based in Brussels and works closely with customers throughout Benelux. He has been a developer since he was very young, starting to code at the age of 7. 
He started learning AI/ML at university, and has fallen in love with it since then.</p>
<p class="c5"><strong><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2020/12/19/Sofian-Hamiti.jpg"><img class="size-full wp-image-20105 alignleft" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2020/12/19/Sofian-Hamiti.jpg" alt="" width="100" height="133" /></a>Sofian Hamiti</strong> is an AI/ML specialist Solutions Architect at AWS. He helps customers across industries accelerate their AI/ML journey by helping them build and operationalize end-to-end machine learning solutions.</p>
<p class="c5"><strong><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/image015-2.jpg"><img class="wp-image-42838 size-full alignleft" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/19/image015-2.jpg" alt="" width="100" height="133" /></a>Shyam Srinivasan</strong> is a Principal Product Manager on the AWS AI/ML team, leading product management for Amazon SageMaker Canvas. Shyam cares about making the world a better place through technology and is passionate about how AI and ML can be a catalyst in this journey.</p>
<p class="c5"><strong><a href="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/27/Avi-Patel.png"><img class="size-full wp-image-43379 alignleft" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/27/Avi-Patel.png" alt="" width="100" height="133" /></a>Avi Patel</strong> works as a software engineer on the Amazon SageMaker Canvas team. His background consists of working full stack with a frontend focus. 
In his spare time, he likes to contribute to open source projects in the crypto space and learn about new DeFi protocols.</p>
<p class="c5"><strong><img class="size-full wp-image-43387 alignleft" src="https://d2908q01vomqb2.cloudfront.net/f1f836cb4ea6efb2a0b1b99f41ad8b103eff4b59/2022/09/27/heyjared.jpg" alt="" width="100" height="125" />Jared Heywood</strong> is a Senior Business Development Manager at AWS. He is a global AI/ML specialist helping customers with no-code machine learning. He has worked in the AutoML space for the past 5 years and launched products at Amazon like Amazon SageMaker JumpStart and Amazon SageMaker Canvas.</p></section>
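
The scheduled shutdown of Canvas sessions described in the example flow can be sketched roughly as follows. This is a minimal illustration, not the function shipped in the linked repository: it assumes the Lambda function uses the boto3 SageMaker client calls `list_apps` and `delete_app`, and the function names and the `CANVAS_DOMAIN_ID` environment variable are hypothetical.

```python
import os
from typing import Any, Dict, List


def shutdown_canvas_apps(sm_client: Any, domain_id: str) -> List[str]:
    """Delete every in-service Canvas app in a Studio domain.

    `sm_client` is expected to behave like boto3.client("sagemaker").
    Returns the user profile names whose Canvas app was deleted.
    """
    stopped: List[str] = []
    kwargs: Dict[str, Any] = {"DomainIdEquals": domain_id}
    while True:
        page = sm_client.list_apps(**kwargs)
        for app in page.get("Apps", []):
            # Only touch Canvas apps that are currently running.
            if app.get("AppType") == "Canvas" and app.get("Status") == "InService":
                sm_client.delete_app(
                    DomainId=app["DomainId"],
                    UserProfileName=app["UserProfileName"],
                    AppType="Canvas",
                    AppName=app["AppName"],
                )
                stopped.append(app["UserProfileName"])
        token = page.get("NextToken")
        if not token:
            return stopped
        kwargs["NextToken"] = token  # fetch the next page of apps


def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
    """Hypothetical Lambda entry point, invoked on a schedule."""
    import boto3  # imported lazily so the helper above can be tested without AWS

    domain_id = os.environ["CANVAS_DOMAIN_ID"]  # assumed environment variable
    stopped = shutdown_canvas_apps(boto3.client("sagemaker"), domain_id)
    return {"stopped_profiles": stopped}
```

Passing the client in as a parameter keeps the shutdown logic separate from AWS credentials, so it can be exercised with a stub before it is wired to a schedule.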
