AWS Blogs, July 16, 07:58
Streamline the path from data to insights with new Amazon SageMaker Catalog capabilities

 

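Amazon SageMaker introduces three new capabilities: integration with Amazon QuickSight, support for Amazon S3 general purpose buckets, and automatic data onboarding from the lakehouse architecture. Together they simplify data workflows and improve insight generation. These capabilities provide a unified platform for managing and visualizing structured and unstructured data while maintaining consistent governance and access controls, helping organizations get full value from their data investments and accelerate decision making.

💡 Amazon SageMaker and Amazon QuickSight integration: Launch QuickSight directly from SageMaker Unified Studio, build dashboards from project data, and publish them to SageMaker Catalog for sharing and governance across the organization.

📂 Amazon S3 general purpose buckets integration: SageMaker Catalog now supports S3 general purpose buckets, with granular data access control through S3 Access Grants, improving data discoverability and collaboration.

🔄 Automatic data onboarding from the lakehouse: Existing datasets in the AWS Glue Data Catalog are onboarded into SageMaker Catalog automatically, with no manual setup, simplifying data management and governance.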

<section class="blog-post-content lb-rtxt"><p>Modern organizations manage data across multiple disconnected systems (structured databases, unstructured files, and separate visualization tools), creating barriers that slow analytics workflows and limit insight generation. Separate visualization platforms in particular often prevent teams from extracting comprehensive business insights.</p><p>These disconnected workflows prevent your organization from maximizing its data investments, delaying decisions and causing missed opportunities for comprehensive analysis that combines multiple data types.</p><p>Starting today, you can use three new capabilities in <a href="https://aws.amazon.com/sagemaker/">Amazon SageMaker</a> to accelerate your path from raw data to actionable insights:</p><ul><li><strong>Amazon QuickSight integration</strong> – Launch <a href="https://aws.amazon.com/quicksight/">Amazon QuickSight</a> directly from Amazon SageMaker Unified Studio to build dashboards using your project data, then publish them to the <a href="https://aws.amazon.com/sagemaker/catalog/">Amazon SageMaker Catalog</a> for broader discovery and sharing across your organization.</li><li><strong>Amazon SageMaker adds support for Amazon S3 general purpose buckets and Amazon S3 Access Grants in SageMaker Catalog</strong> – Make data stored in <a href="https://aws.amazon.com/s3/">Amazon S3</a> general purpose buckets, including unstructured data, easier for teams to find, access, and collaborate on, while maintaining fine-grained access control using Amazon S3 Access Grants.</li><li><strong>Automatic data onboarding from your lakehouse</strong> – Automatically onboard existing <a href="https://aws.amazon.com/glue/">AWS Glue</a> Data Catalog (GDC) datasets from the lakehouse architecture into SageMaker
Catalog, without manual setup.</li></ul><p>These new SageMaker capabilities address the complete data lifecycle within a unified and governed experience. You get automatic onboarding of existing structured data from your lakehouse, seamless cataloging of unstructured data content in Amazon S3, and streamlined visualization through QuickSight, all with consistent governance and access controls.</p><p>Let’s take a closer look at each capability.</p><p><strong>Amazon SageMaker and Amazon QuickSight integration<br /></strong>With this integration, you can build dashboards in Amazon QuickSight using data from your Amazon SageMaker projects. When you launch QuickSight from <a href="https://aws.amazon.com/sagemaker/unified-studio/">Amazon SageMaker Unified Studio</a>, Amazon SageMaker automatically creates the QuickSight dataset and organizes it in a secured folder accessible only to project members.</p><p>Furthermore, the dashboards you build stay within this folder and automatically appear as assets in your SageMaker project, where you can publish them to the SageMaker Catalog and share them with users or groups in your corporate directory. This keeps your dashboards organized, discoverable, and governed within SageMaker Unified Studio.</p><p>To use this integration, both your Amazon SageMaker Unified Studio domain and QuickSight account must be integrated with <a href="https://aws.amazon.com/iam/identity-center/">AWS IAM Identity Center</a> using the same IAM Identity Center instance. Additionally, your QuickSight account must exist in the same AWS account where you want to enable the QuickSight blueprint. You can learn more about the prerequisites on the <a href="https://docs.aws.amazon.com/sagemaker-unified-studio/latest/adminguide/amazon-quicksight.html">documentation page</a>.
</p><p>After these prerequisites are met, you can enable the blueprint for Amazon QuickSight by navigating to the Amazon SageMaker console and choosing the <strong>Blueprints</strong> tab. Then find <strong>Amazon QuickSight</strong> and follow the instructions.</p><p><img class="aligncenter size-full wp-image-98000" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/09/2025-07-news-sagemaker-quicksight-01.png" alt="" width="2043" height="1001" /></p><p>You also need to configure your <strong>SQL analytics</strong> project profile to include Amazon QuickSight in <strong>Add blueprint deployment settings</strong>.</p><p><img class="aligncenter size-full wp-image-98005" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/09/2025-07-news-sagemaker-quicksight-02.png" alt="" width="1395" height="737" /></p><p>To learn more about the onboarding setup, refer to the <a href="https://docs.aws.amazon.com/sagemaker-unified-studio/latest/adminguide/amazon-quicksight.html">documentation page</a>.</p><p>Then, when you create a new project, you need to use the <strong>SQL analytics</strong> profile.</p><p><img class="aligncenter size-full wp-image-98006" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/09/2025-07-news-sagemaker-quicksight-03.png" alt="" width="1330" height="1186" /></p><p>With your project created, you can start building visualizations with QuickSight.
You can navigate to the <strong>Data</strong> tab, select the table or view to visualize, and choose <strong>Open in QuickSight</strong> under <strong>Actions</strong>.</p><p><img class="aligncenter size-full wp-image-98007" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/09/2025-07-news-sagemaker-quicksight-04.png" alt="" width="2538" height="600" /></p><p>This redirects you to the Amazon QuickSight dataset page (<strong>transactions</strong> in this example), where you can choose <strong>USE IN ANALYSIS</strong> to begin exploring the data.</p><p><img class="aligncenter size-full wp-image-98008" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/09/2025-07-news-sagemaker-quicksight-05.png" alt="" width="2538" height="666" /></p><p>When you create a project with the QuickSight blueprint, SageMaker Unified Studio automatically provisions a restricted QuickSight folder per project, and SageMaker scopes all new assets (analyses, datasets, and dashboards) to that folder. The integration maintains real-time folder permission sync, keeping QuickSight folder access permissions aligned with project membership.</p><p><strong>Amazon Simple Storage Service (S3) general purpose buckets integration<br /></strong>Starting today, SageMaker supports S3 general purpose buckets in SageMaker Catalog to increase discoverability, and allows granular permissions through S3 Access Grants so that users can govern data, including sharing it and managing permissions. Data consumers, such as data scientists, engineers, and business analysts, can now discover and access S3 assets through SageMaker Catalog. This expansion also enables data producers to govern security controls on any S3 data asset through a single interface.</p><p>To use this integration, you need appropriate S3 general purpose bucket permissions, and your SageMaker Unified Studio projects must have access to the S3 buckets containing your data.
Learn more about the prerequisites on the <a href="https://docs.aws.amazon.com/sagemaker-unified-studio/latest/userguide/data-s3.html">Amazon S3 data in Amazon SageMaker Unified Studio</a> documentation page.</p><p>You can add a connection to an existing S3 bucket.</p><p><img class="aligncenter size-full wp-image-97993" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/09/2025-07-news-sagemaker-s3-unstructured-00.png" alt="" width="2374" height="1726" /></p><p>When it’s connected, you can browse accessible folders and create discoverable assets by choosing the bucket or a folder and selecting <strong>Publish to Catalog</strong>.</p><p><img class="aligncenter size-full wp-image-97994" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/09/2025-07-news-sagemaker-s3-unstructured-03.png" alt="" width="3028" height="1472" /></p><p>This action creates a SageMaker Catalog asset of type “S3 Object Collection” and opens an asset details page where users can add business context to improve search and discoverability. Once the asset is published, data consumers can discover and subscribe to it. When data consumers subscribe to “S3 Object Collection” assets, SageMaker Catalog automatically grants access using S3 Access Grants upon approval, enabling cross-team collaboration while ensuring only the right users have the right access.</p><p><img class="aligncenter size-full wp-image-97997" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/09/2025-07-news-sagemaker-s3-unstructured-04.png" alt="" width="3501" height="1724" /></p><p>Once you have access, you can process your unstructured data in an Amazon SageMaker Jupyter notebook.
The following screenshot shows an example of processing images in a medical use case.</p><p><img class="aligncenter size-full wp-image-98310" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/15/2025-07-news-sagemaker-s3-unstructured-rev-01.png" alt="" width="3456" height="1744" /></p><p>If you have structured data, you can query it using <a href="https://aws.amazon.com/athena/">Amazon Athena</a> or process it using Spark in notebooks.</p><p><img class="aligncenter size-full wp-image-97998" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/09/2025-07-news-sagemaker-s3-unstructured-01.png" alt="" width="3526" height="1586" /></p><p>With this access granted through S3 Access Grants, you can seamlessly incorporate S3 data into your workflows by analyzing it in notebooks and combining it with structured data in the lakehouse and <a href="https://aws.amazon.com/redshift/">Amazon Redshift</a> for comprehensive analytics. You can access unstructured data, such as documents and images, in JupyterLab notebooks to train ML models or generate queryable insights.</p><p><strong>Automatic data onboarding from your lakehouse<br /></strong>This integration automatically onboards all your lakehouse datasets into SageMaker Catalog. The key benefit is that it brings AWS Glue Data Catalog (GDC) datasets into SageMaker Catalog, eliminating manual setup for cataloging, sharing, and governing them centrally.</p><p>This integration requires an existing lakehouse setup with a Data Catalog containing your structured datasets.</p><p>When you set up a SageMaker domain, SageMaker Catalog automatically ingests metadata from all lakehouse databases and tables.
This means you can immediately explore and use these datasets from within SageMaker Unified Studio without any configuration.</p><p><img class="aligncenter size-full wp-image-98065" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/07/10/2025-07-news-sagemaker-lakehouse-rev-1.png" alt="" width="2652" height="1489" /></p><p>The integration helps you start managing, governing, and consuming these assets from within SageMaker Unified Studio, applying the same governance policies and access controls you use for other data types while unifying technical and business metadata.</p><p><strong>Additional things to know<br /></strong>Here are a few things to note:</p><ul><li><strong>Availability</strong> – These integrations are available in all commercial AWS Regions where Amazon SageMaker is supported.</li><li><strong>Pricing</strong> – Standard SageMaker Unified Studio, QuickSight, and Amazon S3 pricing applies. There are no additional charges for the integrations themselves.</li><li><strong>Documentation</strong> – You can find complete setup guides in the <a href="https://docs.aws.amazon.com/sagemaker/latest/dg/unified-studio.html">SageMaker Unified Studio Documentation</a>.</li></ul><p>Get started with these new integrations through the <a href="https://console.aws.amazon.com/sagemaker/unified-studio">Amazon SageMaker Unified Studio console</a>.</p><p>Happy building!<br />– <a href="https://www.linkedin.com/in/donnieprakoso">Donnie</a></p></section>
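For readers curious about what S3 Access Grants does underneath the subscription flow described in the S3 section, here is a minimal Python sketch of requesting temporary credentials for a granted prefix and listing its objects. It is not what SageMaker Unified Studio runs on your behalf. The helper names, the account ID, and the bucket and prefix are hypothetical, and the trailing `*` on the target is an assumption about how the grant was scoped. The sketch relies on the boto3 `s3control` `get_data_access` call and needs AWS credentials and a Region configured.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri: str) -> tuple[str, str]:
    """Split 's3://bucket/some/prefix/' into ('bucket', 'some/prefix/')."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def data_access_request(account_id: str, s3_uri: str, permission: str = "READ") -> dict:
    """Build the keyword arguments for the S3 Control GetDataAccess call."""
    if permission not in {"READ", "WRITE", "READWRITE"}:
        raise ValueError(f"unsupported permission: {permission!r}")
    split_s3_uri(s3_uri)  # validate the URI before building the request
    # Assumption: the grant covers everything under the prefix, so the
    # target is expressed with a trailing wildcard.
    target = s3_uri.rstrip("*") + "*"
    return {"AccountId": account_id, "Target": target, "Permission": permission}


def list_granted_keys(account_id: str, s3_uri: str) -> list[str]:
    """List the first page of object keys under a prefix using vended credentials."""
    import boto3  # imported lazily so the helpers above work without the AWS SDK

    # Ask S3 Access Grants for temporary credentials scoped to the grant.
    creds = boto3.client("s3control").get_data_access(
        **data_access_request(account_id, s3_uri)
    )["Credentials"]

    # Use the vended credentials, not the caller's own, to read the data.
    s3 = boto3.client(
        "s3",
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    bucket, prefix = split_s3_uri(s3_uri)
    page = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in page.get("Contents", [])]
```

Calling `list_granted_keys("111122223333", "s3://example-bucket/medical-images/")` from a notebook would return the first page of keys under that hypothetical prefix, provided an access grant exists for the caller.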
