AWS Machine Learning Blog 03月21日
Unleashing the multimodal power of Amazon Bedrock Data Automation to transform unstructured data into actionable insights
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

Gartner预测生成式AI解决方案将多模态化,企业面临数据管理难题。亚马逊Bedrock Data Automation可从多模态内容中自动生成有用见解,具有多种优势,适用于多个行业的关键用例,能加速AI应用并提升效率。

亚马逊Bedrock Data Automation能自动处理多模态内容,降低复杂性

该工具增强透明度,遵循安全合规标准,可控制数据提取方式

它提供统一API,优化成本,实现跨区域推断,定价透明可预测

适用于多种行业关键用例,如智能文档处理等,提升效率

Gartner predicts that “by 2027, 40% of generative AI solutions will be multimodal (text, image, audio and video) by 2027, up from 1% in 2023.”

The McKinsey 2023 State of AI Report identifies data management as a major obstacle to AI adoption and scaling. Enterprises generate massive volumes of unstructured data, from legal contracts to customer interactions, yet extracting meaningful insights remains a challenge. Traditionally, transforming raw data into actionable intelligence has demanded significant engineering effort. It often requires managing multiple machine learning (ML) models, designing complex workflows, and integrating diverse data sources into production-ready formats.

The result is expensive, brittle workflows that demand constant maintenance and engineering resources. In a world where—according to Gartner—over 80% of enterprise data is unstructured, enterprises need a better way to extract meaningful information to fuel innovation.

Today, we’re excited to announce the general availability of Amazon Bedrock Data Automation, a powerful, fully managed feature within Amazon Bedrock that automate the generation of useful insights from unstructured multimodal content such as documents, images, audio, and video for your AI-powered applications. It enables organizations to extract valuable information from multimodal content unlocking the full potential of their data without requiring deep AI expertise or managing complex multimodal ML pipelines. With Amazon Bedrock Data Automation, enterprises can accelerate AI adoption and develop solutions that are secure, scalable, and responsible.

The benefits of using Amazon Bedrock Data Automation

Amazon Bedrock Data Automation provides a single, unified API that automates the processing of unstructured multi-modal content, minimizing the complexity of orchestrating multiple models, fine-tuning prompts, and stitching outputs together. It helps ensure high accuracy and cost efficiency while significantly lowering processing costs.

Built with responsible AI, Amazon Bedrock Data Automation enhances transparency with visual grounding and confidence scores, allowing outputs to be validated before integration into mission-critical workflows. It adheres to enterprise-grade security and compliance standards, enabling you to deploy AI solutions with confidence. It also enables you to define when data should be extracted as-is and when it should be inferred, giving complete control over the process.

Cross-Region inference enables seamless management of unplanned traffic bursts by using compute across different AWS Regions. Amazon Bedrock Data Automation optimizes for available AWS Regional capacity by automatically routing across regions within the same geographic area to maximize throughput at no additional cost. For example, a request made in the US stays within Regions in the US. Amazon Bedrock Data Automation is currently available in US West (Oregon) and US East (N. Virginia) AWS Regions helping to ensure seamless request routing and enhanced reliability.  Amazon Bedrock Data Automation is expanding to additional Regions, so be sure to check the documentation for the latest updates.

Amazon Bedrock Data Automation offers transparent and predictable pricing based on the modality of processed content and the type of output used (standard vs custom output). Pay according to the number of pages, quantity of images, and duration of audio and video files. This straightforward pricing model provides easier cost calculation compared to token-based pricing model.

Use cases for Amazon Bedrock Data Automation

Key use cases such as intelligent document processingmedia asset analysis and monetization, speech analytics, search and discovery, and agent-driven operations highlight how Amazon Bedrock Data Automation enhances innovation, efficiency, and data-driven decision-making across industries.

Intelligent document processing

According to Fortune Business Insights, the intelligent document processing industry is projected to grow from USD 10.57 billion in 2025 to USD 66.68 billion by 2032 with a CAGR of 30.1 %. IDP is powering critical workflows across industries and enabling businesses to scale with speed and accuracy. Financial institutions use IDP to automate tax forms and fraud detection, while healthcare providers streamline claims processing and medical record digitization. Legal teams accelerate contract analysis and compliance reviews, and in oil and gas, IDP enhances safety reporting. Manufacturers and retailers optimize supply chain and invoice processing, helping to ensure seamless operations. In the public sector, IDP improves citizen services, legislative document management, and compliance tracking. As businesses strive for greater automation, IDP is no longer an option, it’s a necessity for cost reduction, operational efficiency, and data-driven decision-making.

Let’s explore a real-world use case showcasing how Amazon Bedrock Data Automation enhances efficiency in loan processing.

Loan processing is a complex, multi-step process that involves document verification, credit assessments, policy compliance checks, and approval workflows, requiring precision and efficiency at every stage. Loan processing with traditional AWS AI services is shown in the following figure.

As shown in the preceding figure, loan processing is a multi-step workflow that involves handling diverse document types, managing model outputs, and stitching results across multiple services. Traditionally, documents from portals, email, or scans are stored in Amazon Simple Storage Service (Amazon S3), requiring custom logic to split multi-document packages. Next, Amazon Comprehend or custom classifiers categorize them into types such as W2s, bank statements, and closing disclosures, while Amazon Textract extracts key details. Additional processing is needed to standardize formats, manage JSON outputs, and align data fields, often requiring manual integration and multiple API calls. In some cases, foundation models (FMs) generate document summaries, adding further complexity. Additionally, human-in-the-loop verification may be required for low-threshold outputs.

With Amazon Bedrock Data Automation, this entire process is now simplified into a single unified API call. It automates document classification, data extraction, validation, and structuring, removing the need for manual stitching, API orchestration, and custom integration efforts, significantly reducing complexity and accelerating loan processing workflows as shown in the following figure.

As shown in the preceding figure, when using Amazon Bedrock Data Automation, loan packages from third-party systems, portals, email, or scanned documents are stored in Amazon S3, where Amazon Bedrock Data Automation automates document splitting and processing, removing the need for custom logic. After the loan packages are ingested, Amazon Bedrock Data Automation classifies documents such W2s, bank statements, and closing disclosures in a single step, alleviating the need for separate classifier model calls. Amazon Bedrock Data Automation then extracts key information based on the customer requirement, capturing critical details such as employer information from W2s, transaction history from bank statements, and loan terms from closing disclosures.

Unlike traditional workflows that require manual data normalization, Amazon Bedrock Data Automation automatically standardizes extracted data, helping to ensure consistent date formats, currency values, and field names without additional processing based on the customer provided output schema. Moreover, Amazon Bedrock Data Automation enhances compliance and accuracy by providing summarized outputs, bounding boxes for extracted fields, and confidence scores, delivering structured, validated, and ready-to-use data for downstream applications with minimal effort.

In summary, Amazon Bedrock Data Automation enables financial institutions to seamlessly process loan documents from ingestion to final output through a single unified API call, eliminating the need for multiple independent steps.

While this example highlights financial services, the same principles apply across industries to streamline complex document processing workflows. Built for scale, security, and transparency,  Amazon Bedrock Data Automation adheres to enterprise-grade compliance standards, providing robust data protection. With visual grounding, confidence scores, and seamless integration into knowledge bases, it powers Retrieval Augmented Generation (RAG)-driven document retrieval and completes the deployment of production-ready AI workflows in days, not months.

It also offers flexibility in data extraction by supporting both explicit and implicit extractions. Explicit extraction is used for clearly stated information, such as names, dates, or specific values, while implicit extraction infers insights that aren’t directly stated but can be derived through context and reasoning. This ability to toggle between extraction types enables more comprehensive and nuanced data processing across various document types.

This is achieved through responsible AI, with Amazon Bedrock Data Automation passing every process through a responsible AI model to help ensure fairness, accuracy, and compliance in document automation.

By automating document classification, extraction, and normalization, it not only accelerates document processing, it also enhances downstream applications, such as knowledge management and intelligent search. With structured, validated data readily available, organizations can unlock deeper insights and improve decision-making.

This seamless integration extends to efficient document search and retrieval, transforming business operations by enabling quick access to critical information across vast repositories. By converting unstructured document collections into searchable knowledge bases, organizations can seamlessly find, analyze, and use their data. This is particularly valuable for industries handling large document volumes, where rapid access to specific information is crucial. Legal teams can efficiently search through case files, healthcare providers can retrieve patient histories and research papers, and government agencies can manage legislative records and policy documents. Powered by Amazon Bedrock Data Automation and Amazon Bedrock Knowledge Bases, this integration streamlines investment research, regulatory filings, clinical protocols, and public sector record management, significantly improving efficiency across industries.

The following figure shows how Amazon Bedrock Data Automation seamlessly integrates with Amazon Bedrock Knowledge Bases to extract insights from unstructured datasets and ingest them into a vector database for efficient retrieval. This integration enables organizations to unlock valuable knowledge from their data, making it accessible for downstream applications. By using these structured insights, businesses can build generative AI applications, such as assistants that dynamically answer questions and provide context-aware responses based on the extracted information. This approach enhances knowledge retrieval, accelerates decision-making, and enables more intelligent, AI-driven interactions.

The preceding architecture diagram showcases a pipeline for processing and retrieving insights from multimodal content using Amazon Bedrock Data Automation and Amazon Bedrock Knowledge Bases. Unstructured data, such as documents, images, videos, and audio, is first ingested into an Amazon S3 bucket. Amazon Bedrock Data Automation then processes this content, extracting key insights and transforming it for further use. The processed data is stored in Amazon Bedrock Knowledge Bases, where an embedding model converts it into vector representations, which are then stored in a vector database for efficient semantic search. Amazon API Gateway (WebSocket API) facilitates real-time interactions, enabling users to query the knowledge base dynamically via a chatbot or other interfaces. This architecture enhances automated data processing, efficient retrieval, and seamless real-time access to insights.

Beyond intelligent search and retrieval, Amazon Bedrock Data Automation enables organizations to automate complex decision-making processes, providing greater accuracy and compliance in document-driven workflows. By using structured data, businesses can move beyond simple document processing to intelligent, policy-aware automation.

Amazon Bedrock Data Automation can also be used with Amazon Bedrock Agents to take the next step in automation. Going beyond traditional IDP, this approach enables autonomous workflows that assist knowledge workers and streamline decision-making. For example, in insurance claims processing, agents validate claims against policy documents; while in loan processing, they assess mortgage applications against underwriting policies. With multi-agent workflows, policy validation, automated decision support, and document generation, this approach enhances efficiency, accuracy, and compliance across industries.

Similarly, Amazon Bedrock Data Automation is simplifying media and entertainment use cases, seamlessly integrating workflows through its unified API. Let’s take a closer look at how it’s driving this transformation

Media asset analysis and monetization

Companies in media and entertainment (M&E), advertising, gaming, and education own vast digital assets, such as videos, images, and audio files, and require efficient ways to analyze them. Gaining insights from these assets enables better indexing, deeper analysis, and supports monetization and compliance efforts.

The image and video modalities of Amazon Bedrock Data Automation provide advanced features for efficient extraction and analysis.

The customized approach to extracting and analyzing video content involves a sophisticated process that gathers information from both the visual and audio components of the video, making it complex to build and manage.

As shown in the preceding figure, a customized video analysis pipeline involves sampling image frames from the visual portion of the video and applying both specialized and FMs to extract information, which is then aggregated at the shot level. It also transcribes the audio into text and combines both visual and audio data for chapter level analysis. Additionally, large language model (LLM)-based analysis is applied to derive further insights, such as video summaries and classifications. Finally, the data is stored in a database for downstream applications to consume.

Media video analysis with Amazon Bedrock Data Automation now simplifies this workflow into a single unified API call, minimizing complexity and reducing integration effort, as shown in the following figure.

Customers can use Amazon Bedrock Data Automation to support popular media analysis use cases such as:

Amazon Bedrock Data Automation automates video, image, and audio analysis, making DAM more scalable, efficient and intelligent.

Amazon Bedrock Data Automation automates content analysis across video, audio, and images, eliminating complex workflows. It extracts scene summaries, audio segments, and IAB taxonomies to power video ads solution, improving contextual ad placement and improve ad campaign performance.

Amazon Bedrock Data Automation streamlines compliance by using AI-driven content moderation to analyze both the visual and audio components of media. This enables users to define and apply customized policies to evaluate content against their specific compliance requirements.

Intelligent speech analytics

Amazon Bedrock Data Automation is used in intelligent speech analytics to derive insights from audio data across multiple industries with speed and accuracy. Financial institutions rely on intelligent speech analytics to monitor call centers for compliance and detect potential fraud, while healthcare providers use it to capture patient interactions and optimize telehealth communications. In retail and hospitality, speech analytics drives customer engagement by uncovering insights from live feedback and recorded interactions. With the exponential growth of voice data, intelligent speech analytics is no longer a luxury—it’s a vital tool for reducing costs, improving efficiency, and driving smarter decision-making.

Customer service – AI-driven call analytics for better customer experience

Businesses can analyze call recordings at scale to gain actionable insights into customer sentiment, compliance, and service quality. Contact centers can use Amazon Bedrock Data Automation to:

A traditional call analytics approach is shown in the following figure.

Processing customer service call recordings involves multiple steps, from audio capture to advanced AI-driven analysis as highlighted below:

Unified, API-drove speech analytics with Amazon Bedrock Data Automation

The following figure shows customer service call analytics using Amazon Bedrock Data Automation-power intelligent speech analytics.

Optimizing customer service call analysis requires a seamless, automated pipeline that efficiently ingests, processes, and extracts insights from audio recordings as mentioned below:

By consolidating multiple steps into a single API call, customer service centers benefit from faster processing, reduced error rates, and significantly lower integration complexity. This streamlined approach enables real-time monitoring and proactive agent coaching, ultimately driving improved customer experience and operational agility.

Before the availability of Amazon Bedrock Data Automation for intelligent speech analytics, customer service call analysis was a fragmented, multi-step process that required juggling various tools and models. Now, with the unified API of Amazon Bedrock Data Automation, organizations can quickly transform raw voice data into actionable insights—cutting through complexity, reducing costs, and empowering teams to enhance service quality and compliance.

When to choose Amazon Bedrock Data Automation instead of traditional AI/ML services

You should choose Amazon Bedrock Data Automation when you need a simple, API-driven solution for multi-modal content processing without the complexity of managing and orchestrating across multiple models or prompt engineering. With a single API call, Amazon Bedrock Data Automation seamlessly handles asset splitting, classification, information extraction, visual grounding, and confidence scoring, eliminating the need for manual orchestration.

On the other hand, the core capabilities of Amazon Bedrock are ideal if you require full control over models and workflows to tailor solutions to your organization’s specific business needs. Developers can use Amazon Bedrock to select FMs based on price-performance, fine-tune prompt engineering for data extraction, train custom classification models, implement responsible AI guardrails, and build an orchestration pipeline to provide consistent output.

Amazon Bedrock Data Automation streamlines multi-modal processing, while Amazon Bedrock offers building blocks for deeper customization and control.

Conclusion

Amazon Bedrock Data Automation provides enterprises with scalability, security, and transparency; enabling seamless processing of unstructured data with confidence. Designed for rapid deployment, it helps developers transition from prototype to production in days, accelerating time-to-value while maintaining cost efficiency. Start using Amazon Bedrock Data Automation today and unlock the full potential of your unstructured data. For solution guidance, see Guidance for Multimodal Data Processing with Bedrock Data Automation.


About the Authors

Wrick Talukdar is a Tech Lead – Generative AI Specialist focused on Intelligent Document Processing. He leads machine learning initiatives and projects across business domains, leveraging multimodal AI, generative models, computer vision, and natural language processing. He speaks at conferences such as AWS re:Invent, IEEE, Consumer Technology Society(CTSoc), YouTube webinars, and other industry conferences like CERAWEEK and ADIPEC. In his free time, he enjoys writing and birding photography.

Lana Zhang is a Senior Solutions Architect at AWS World Wide Specialist Organization AI Services team, specializing in AI and generative AI with a focus on use cases including content moderation and media analysis. With her expertise, she is dedicated to promoting AWS AI and generative AI solutions, demonstrating how generative AI can transform classic use cases with advanced business value. She assists customers in transforming their business solutions across diverse industries, including social media, gaming, e-commerce, media, advertising, and marketing.

Julia Hu is a Specialist Solutions Architect who helps AWS customers and partners build generative AI solutions using Amazon Q Business on AWS. Julia has over 4 years of experience developing solutions for customers adopting AWS services on the forefront of cloud technology.

Keith Mascarenhas leads worldwide GTM strategy for Generative AI at AWS, developing enterprise use cases and adoption frameworks for Amazon Bedrock. Prior to this, he drove AI/ML solutions and product growth at AWS, and held key roles in Business Development, Solution Consulting and Architecture across Analytics, CX and Information Security.

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

亚马逊Bedrock Data Automation 多模态内容 数据管理 行业应用 效率提升
相关文章