AWS Machine Learning Blog, October 18, 2024
Summarize call transcriptions securely with Amazon Transcribe and Amazon Bedrock Guardrails

 

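In today’s business environment, audio recordings of meetings, interviews, and customer interactions are essential for capturing valuable information. Transcribing and summarizing these recordings by hand is time-consuming and tedious. Fortunately, advances in generative AI and automatic speech recognition (ASR) have paved the way for automated solutions that streamline the process. This post shows how to use Amazon Transcribe and Amazon Bedrock to transcribe audio recordings in near real time, summarize them, and redact sensitive information.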

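🚀 **Near real-time transcription with Amazon Transcribe** Amazon Transcribe is a cloud-based speech-to-text service that converts audio recordings into text and supports identifying and partitioning multiple speakers. It uses advanced speech recognition algorithms and machine learning models to transcribe audio accurately, even with accents, background noise, and other challenges.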

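🤖 **Summarization and sensitive data redaction with Amazon Bedrock** Amazon Bedrock is a fully managed service that offers high-performing foundation models (FMs) from leading model providers (such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, Mistral AI, and Amazon) through a single API, along with the security, privacy, and responsible AI capabilities needed to build generative AI applications. With Amazon Bedrock Guardrails, you can redact sensitive information such as PII found in the generated call transcription summaries.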

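⚙️ **Orchestrating the workflow with AWS Step Functions** AWS Step Functions is a serverless workflow orchestration service that chains AWS services together to automate complex tasks. In this solution, Step Functions coordinates Amazon Transcribe and Amazon Bedrock so that the transcription, summarization, and redaction of each recording run to completion.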

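📨 **Delivering the summary to designated recipients with Amazon SNS** Amazon SNS is a fully managed publish/subscribe messaging service that reliably delivers messages to destinations such as email, SMS, and mobile push notifications. In this solution, Amazon SNS sends the summarized, redacted transcript to the designated recipients so that they receive key information promptly.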

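📊 **Faster insights with customer privacy protected** By automating the transcription, summarization, and redaction of audio recordings, the solution helps businesses quickly surface call trends and other important information while protecting customer privacy. Analysts can understand customer needs sooner and improve the quality of customer service.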

Given the volume of meetings, interviews, and customer interactions in modern business environments, audio recordings play a crucial role in capturing valuable information. Manually transcribing and summarizing these recordings can be a time-consuming and tedious task. Fortunately, advancements in generative AI and automatic speech recognition (ASR) have paved the way for automated solutions that can streamline this process.

Customer service representatives receive a high volume of calls each day. Previously, calls were recorded and manually reviewed later for compliance with regulations and company policies. Call recordings had to be transcribed, summarized, and then redacted for personally identifiable information (PII) before they could be analyzed, which delayed access to insights.

Redacting PII is a critical security practice. Maintaining the privacy and protection of individuals’ personal information is not only a matter of ethical responsibility, but also a legal requirement. In this post, we show you how to use Amazon Transcribe to get near real-time transcriptions of calls, which are then sent to Amazon Bedrock for summarization and sensitive data redaction. We’ll walk through an architecture that uses AWS Step Functions to orchestrate the process, providing seamless integration and efficient processing.

Amazon Bedrock is a fully managed service that offers a choice of high-performing foundation models (FMs) from leading model providers such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, Mistral AI, and Amazon through a single API, along with a broad set of capabilities you need to build generative AI applications with security, privacy, and responsible AI. You can use Amazon Bedrock Guardrails to redact sensitive information such as PII found in the generated call transcription summaries. Clean, summarized transcripts are then sent to analysts. This provides quicker access to call trends while protecting customer privacy.

Solution overview

The architecture of this solution is designed to be scalable, efficient, and compliant with privacy regulations. It includes the following key components:

    Recording – An audio file, such as a meeting or support call, to be transcribed and summarized
    Step Functions workflow – Coordinates the transcription and summarization process
    Amazon Transcribe – Converts audio recordings into text
    Amazon Bedrock – Summarizes the transcription and removes PII
    Amazon SNS – Delivers the summary to the designated recipient
    Recipient – Receives the summarized, PII-redacted transcript

The following diagram shows the architecture overview.

The workflow orchestrated by Step Functions is as follows:

    An audio recording is provided as an input to the Step Functions workflow. This could be done manually or automatically depending on the specific use case and integration requirements.
    The workflow invokes Amazon Transcribe, which converts the multi-speaker audio recording into a textual, speaker-partitioned transcription. Amazon Transcribe uses advanced speech recognition algorithms and machine learning (ML) models to accurately partition speakers and transcribe the audio, handling various accents, background noise, and other challenges.
    The transcription output from Amazon Transcribe is then passed to Anthropic’s Claude 3 Haiku model on Amazon Bedrock through AWS Lambda. This model was chosen because it has relatively lower latency and cost than other models. The instructions and transcript are both passed to the model as context. The model first summarizes the transcript according to its summary instructions, and then the summarized output (the model response) is evaluated by Amazon Bedrock Guardrails to redact PII. To learn how it blocks harmful content, refer to How Amazon Bedrock Guardrails works.
    The output from Amazon Bedrock is stored in Amazon Simple Storage Service (Amazon S3) and sent to the designated recipient using Amazon Simple Notification Service (Amazon SNS). Amazon SNS supports various delivery channels, including email, SMS, and mobile push notifications, making sure that the summary reaches the intended recipient in a timely and reliable manner.
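
To make the transcription step more concrete, here is a minimal sketch of the request a workflow could send to Amazon Transcribe to get a speaker-partitioned transcript. The helper name `build_transcription_request`, the `transcripts/` output prefix, and the `en-US` language code are assumptions for illustration; the deployed template may configure the job differently.

```python
def build_transcription_request(job_name, media_uri, output_bucket, max_speakers=2):
    """Build keyword arguments for Amazon Transcribe's StartTranscriptionJob.

    Hypothetical helper: the job settings used by the deployed solution
    are not shown in this post.
    """
    if max_speakers < 2:
        raise ValueError("Speaker partitioning needs at least 2 speakers")
    return {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": "en-US",  # assumption: English-language recordings
        "OutputBucketName": output_bucket,
        "OutputKey": f"transcripts/{job_name}.json",  # assumed output prefix
        "Settings": {
            # Label each segment with the speaker who said it.
            "ShowSpeakerLabels": True,
            "MaxSpeakerLabels": max_speakers,
        },
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   transcribe = boto3.client("transcribe")
#   request = build_transcription_request(
#       "team-meeting", "s3://my-bucket/recordings/team-meeting.mp3", "my-bucket", 4)
#   transcribe.start_transcription_job(**request)
```

Keeping the request construction separate from the API call makes it easy to inspect or unit test the settings without touching AWS.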

The recipient can then review the concise summary, quickly grasping the key points and insights from the original audio recording. Additionally, sensitive information has been redacted, maintaining privacy and compliance with relevant regulations.
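
The delivery step can be sketched in a few lines. The example below prepares the arguments for an Amazon SNS `publish` call; the helper name `build_sns_message` and the subject format are assumptions, not taken from the deployed solution. It trims the subject because SNS email subjects must be ASCII, contain no line breaks, and be fewer than 100 characters.

```python
def build_sns_message(topic_arn, recording_name, summary):
    """Build keyword arguments for an Amazon SNS publish call.

    Hypothetical helper for illustration only.
    """
    subject = f"Summary: {recording_name}"
    # Collapse line breaks and repeated whitespace into single spaces.
    subject = " ".join(subject.split())
    # Replace non-ASCII characters with "?" and keep fewer than 100 characters.
    subject = subject.encode("ascii", "replace").decode("ascii")[:99]
    return {"TopicArn": topic_arn, "Subject": subject, "Message": summary}


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   sns = boto3.client("sns")
#   sns.publish(**build_sns_message(topic_arn, "team-meeting.mp3", summary_text))
```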

The following diagram shows the Step Functions workflow.
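
As a rough illustration of how such a workflow could be expressed, the sketch below builds an Amazon States Language definition as a Python dictionary. It is not the state machine from the CloudFormation template: the state names, the 20-second wait, the input fields (`jobName`, `mediaUri`), and the assumption that the Lambda function returns a `summary` field are all hypothetical.

```python
import json


def build_state_machine_definition(lambda_arn, topic_arn):
    """Return a hypothetical transcribe -> summarize -> notify definition."""
    status_path = "$.transcribe.TranscriptionJob.TranscriptionJobStatus"
    return {
        "StartAt": "StartTranscription",
        "States": {
            "StartTranscription": {
                "Type": "Task",
                "Resource": "arn:aws:states:::aws-sdk:transcribe:startTranscriptionJob",
                "Parameters": {
                    "TranscriptionJobName.$": "$.jobName",
                    "Media": {"MediaFileUri.$": "$.mediaUri"},
                    "LanguageCode": "en-US",
                    "Settings": {"ShowSpeakerLabels": True, "MaxSpeakerLabels": 2},
                },
                "ResultPath": "$.transcribe",
                "Next": "WaitForTranscription",
            },
            "WaitForTranscription": {
                "Type": "Wait", "Seconds": 20, "Next": "GetTranscriptionStatus"},
            "GetTranscriptionStatus": {
                "Type": "Task",
                "Resource": "arn:aws:states:::aws-sdk:transcribe:getTranscriptionJob",
                "Parameters": {"TranscriptionJobName.$": "$.jobName"},
                "ResultPath": "$.transcribe",
                "Next": "IsTranscriptionDone",
            },
            "IsTranscriptionDone": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": status_path, "StringEquals": "COMPLETED",
                     "Next": "Summarize"},
                    {"Variable": status_path, "StringEquals": "FAILED",
                     "Next": "TranscriptionFailed"},
                ],
                "Default": "WaitForTranscription",  # still in progress: poll again
            },
            "Summarize": {
                "Type": "Task",
                "Resource": "arn:aws:states:::lambda:invoke",
                "Parameters": {"FunctionName": lambda_arn, "Payload.$": "$"},
                # Assumes the Lambda function returns {"summary": "..."}.
                "ResultSelector": {"summary.$": "$.Payload.summary"},
                "ResultPath": "$.result",
                "Next": "Notify",
            },
            "Notify": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sns:publish",
                "Parameters": {"TopicArn": topic_arn, "Message.$": "$.result.summary"},
                "End": True,
            },
            "TranscriptionFailed": {"Type": "Fail", "Error": "TranscriptionFailed"},
        },
    }


def referenced_states(definition):
    """Collect every state name that the definition transitions to."""
    targets = {definition["StartAt"]}
    for state in definition["States"].values():
        for key in ("Next", "Default"):
            if key in state:
                targets.add(state[key])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets


definition = build_state_machine_definition(
    "arn:aws:lambda:us-west-2:123456789012:function:example",
    "arn:aws:sns:us-west-2:123456789012:example")
# Every transition should point at a state that exists.
assert referenced_states(definition) <= set(definition["States"])
print(json.dumps(definition, indent=2))
```

The polling loop (wait, check status, branch) reflects the fact that transcription jobs are asynchronous; the Choice state falls back to waiting again until the job completes or fails.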

Prerequisites

Follow these steps before starting:

    Amazon Bedrock users need to request access to models before they’re available for use. This is a one-time action. For this solution, you need to enable access to Anthropic’s Claude 3 Haiku model on Amazon Bedrock. For more information, refer to Access Amazon Bedrock foundation models. Deployment, as described below, is currently supported only in the US West (Oregon) AWS Region (us-west-2). You may explore other models if desired. You might need some customizations to deploy to alternative Regions with different model availability (such as us-east-1, which hosts Anthropic’s Claude 3.5 Sonnet). Make sure you consider model quality, speed, and cost tradeoffs before choosing a model.
    Create a guardrail for PII redaction. Configure filters to block or mask sensitive information. This option can be found on the Amazon Bedrock console on the Add sensitive information filters page when creating a guardrail. To learn how to configure filters for other use cases, refer to Remove PII from conversations by using sensitive information filters.
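
If you prefer to create the guardrail programmatically instead of through the console, the following sketch shows one possible request for the Amazon Bedrock `create_guardrail` API. The helper name, the guardrail description, the blocked-message text, and the chosen PII types are assumptions; pick the entity types that match your own data.

```python
def build_guardrail_request(name, pii_types=("NAME", "EMAIL", "PHONE", "ADDRESS")):
    """Build keyword arguments for the Amazon Bedrock create_guardrail call.

    Hypothetical helper: the entity list and messages are examples only.
    """
    return {
        "name": name,
        "description": "Masks PII in generated call summaries",
        "sensitiveInformationPolicyConfig": {
            "piiEntitiesConfig": [
                # ANONYMIZE masks the value; BLOCK would reject the content.
                {"type": pii_type, "action": "ANONYMIZE"}
                for pii_type in pii_types
            ]
        },
        "blockedInputMessaging": "This request was blocked by the guardrail.",
        "blockedOutputsMessaging": "This response was blocked by the guardrail.",
    }


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   bedrock = boto3.client("bedrock")
#   guardrail = bedrock.create_guardrail(**build_guardrail_request("call-summary-pii"))
#   # Publish a numbered version so it can be referenced at inference time.
#   bedrock.create_guardrail_version(guardrailIdentifier=guardrail["guardrailId"])
```

Masking rather than blocking fits the summarization use case, because the summary stays readable while the sensitive values are replaced with placeholders.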

Deploy solution resources

To deploy the solution, download an AWS CloudFormation template to automatically provision the necessary resources in your AWS account. The template sets up the following components:

By using this template, you can quickly deploy the sample solution with minimal manual configuration. The template requires the following parameters:

The Summary instructions are read into your Lambda function as an environment variable.

    import json
    import os

    # Use the provided instructions to provide the summary. Use a default if no instructions are provided.
    SUMMARY_INSTRUCTIONS = os.getenv('SUMMARY_INSTRUCTIONS')

These are then used as part of your payload to Anthropic’s Claude 3 Haiku model. This is shared to give you an understanding of how to pass the instructions and text to the model.

    # Create the payload to provide to the Anthropic model.
    user_message = {"role": "user", "content": f"{SUMMARY_INSTRUCTIONS}{transcript}"}
    messages = [user_message]
    response = generate_message(bedrock_client, 'anthropic.claude-3-haiku-20240307-v1:0', "", messages, 1000)

The generate_message() function contains the invocation to Amazon Bedrock with the guardrail ID and other relevant parameters.

    # BEDROCK_GUARDRAIL_ID is expected to be defined elsewhere in the Lambda function's module.
    def generate_message(bedrock_runtime, model_id, system_prompt, messages, max_tokens):
        # Request body in the Anthropic Messages format used by Amazon Bedrock.
        body = json.dumps(
            {
                "anthropic_version": "bedrock-2023-05-31",
                "max_tokens": max_tokens,
                "system": system_prompt,
                "messages": messages
            }
        )
        print(f'Invoking model: {model_id}')

        # Invoke the model and have the guardrail evaluate the response.
        response = bedrock_runtime.invoke_model(
            body=body,
            modelId=model_id,
            # contentType=contentType,
            guardrailIdentifier=BEDROCK_GUARDRAIL_ID,
            guardrailVersion="1",
            trace="ENABLED")
        response_body = json.loads(response.get('body').read())
        print(f'response: {response}')
        return response_body
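
The function returns the parsed response body, so the caller still has to pull the summary text out of it. The sketch below shows one way to do that, assuming the body follows the Anthropic Messages response format (a `content` list of text blocks) and that Amazon Bedrock reports the guardrail outcome in an `amazon-bedrock-guardrailAction` field. The helper name `extract_summary` is hypothetical, and the field is read defensively in case it is absent.

```python
def extract_summary(response_body):
    """Return (summary_text, guardrail_intervened) from a parsed response body.

    Hypothetical helper; it assumes the Anthropic Messages response shape.
    """
    # Concatenate all text blocks returned by the model.
    text = "".join(
        block.get("text", "")
        for block in response_body.get("content", [])
        if block.get("type") == "text"
    )
    # "INTERVENED" indicates the guardrail masked or blocked content.
    action = response_body.get("amazon-bedrock-guardrailAction", "NONE")
    return text, action == "INTERVENED"


# Example with a hand-written response body (not real model output):
sample_body = {
    "content": [{"type": "text", "text": "{NAME} called about a billing issue."}],
    "amazon-bedrock-guardrailAction": "INTERVENED",
}
summary, intervened = extract_summary(sample_body)
print(summary, intervened)
```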

Deploy the solution

After you deploy the resources using AWS CloudFormation, complete these steps:

    Add a Lambda layer.

Although AWS Lambda regularly updates the version of AWS Boto3 included, at the time of writing this post, it still provides version 1.34.126. To use Amazon Bedrock Guardrails, you need version 1.34.90 or higher, so we’ll add a Lambda layer that updates Boto3. You can follow the official developer guide on how to add a Lambda layer.

There are different ways to create a Lambda layer. A simple method is to use the steps outlined in Packaging the layer content, which references a sample application repo. You should be able to replace requests==2.31.0 in requirements.txt with boto3, which will install the latest available version, and then create the layer.

To add the layer to Lambda, make sure that the parameters specified in Creating the layer match the deployed Lambda. That is, you need to update compatible-architectures to x86_64.
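
To confirm that the layer took effect, you could log the Boto3 version from inside the function and compare it with the minimum mentioned above. The following is a small sketch; the helper names are hypothetical. Versions are compared as integer tuples because a plain string comparison would order "1.34.126" before "1.34.90".

```python
def parse_version(version):
    """Turn a version string such as '1.34.90' into a tuple of integers."""
    return tuple(int(part) for part in version.split(".")[:3])


def meets_minimum(installed, minimum="1.34.90"):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# Usage inside the Lambda function:
#   import boto3
#   if not meets_minimum(boto3.__version__):
#       raise RuntimeError(f"Boto3 {boto3.__version__} is too old for guardrails")
```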

    Acknowledge the Amazon SNS email confirmation that you should receive a few moments after creating the CloudFormation stack.
    On the AWS CloudFormation console, find the stack you just created.
    On the stack’s Outputs tab, look for the value associated with AssetBucketName. It will look something like summary-generator-assetbucket-xxxxxxxxxxxxx.
    On the Amazon S3 console, find your S3 assets bucket.

This is where you’ll upload your recordings. Valid file formats are MP3, MP4, WAV, FLAC, AMR, OGG, and WebM.

    Upload your recording to the recordings folder in Amazon S3.

Uploading a recording automatically triggers the AWS Step Functions state machine. For this example, we use a sample recording of a team meeting.
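
You can also upload recordings from a script instead of the console. The sketch below checks the file extension against the formats listed above and builds the object key under the recordings folder. The helper name `recording_key` is hypothetical, and the bucket name in the usage comment is a placeholder for your AssetBucketName value.

```python
import os

# File formats listed in this post as valid for recordings.
VALID_EXTENSIONS = {".mp3", ".mp4", ".wav", ".flac", ".amr", ".ogg", ".webm"}


def recording_key(local_path, prefix="recordings"):
    """Return the S3 object key for a recording, validating its format."""
    filename = os.path.basename(local_path)
    extension = os.path.splitext(filename)[1].lower()
    if extension not in VALID_EXTENSIONS:
        raise ValueError(f"Unsupported recording format: {extension or 'none'}")
    return f"{prefix}/{filename}"


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   path = "team-meeting.mp3"
#   s3.upload_file(path, "summary-generator-assetbucket-xxxxxxxxxxxxx", recording_key(path))
```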

    On the AWS Step Functions console, find the summary-generator state machine.
    Choose the name of the state machine run with the status Running.

Here, you can watch the progress of the state machine as it processes the recording. After it reaches its Success state, you should receive an emailed summary of the recording. Alternatively, you can navigate to the S3 assets bucket and view the transcript there in the transcripts folder.

Expand the solution

Now that you have a working solution, here are some potential ideas to customize the solution for your specific use cases:

Clean up

Clean up the resources you created for this solution to avoid incurring costs. You can use an AWS SDK, the AWS Command Line Interface (AWS CLI), or the console.

    Delete the Amazon Bedrock guardrail and the Lambda layer you created.
    Delete the CloudFormation stack.

To use the console, follow these steps:

    On the Amazon Bedrock console, in the navigation menu, select Guardrails. Choose your guardrail, then select Delete.
    On the AWS Lambda console, in the navigation menu, select Layers. Choose your layer, then select Delete.
    On the AWS CloudFormation console, in the navigation menu, select Stacks. Choose the stack you created, then select Delete.

Deleting the stack won’t delete the associated S3 bucket. If you no longer require the recordings or transcripts, you can delete the bucket separately. Amazon Transcribe is designed to automatically delete transcription jobs after 90 days. However, you can opt to manually delete these jobs before the 90-day retention period expires.
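
If you would rather clean up with an AWS SDK, the following sketch lists the three delete calls in the same order as the console steps and runs them against Boto3 clients. The helper names and the structure of the plan are assumptions for illustration; the identifiers you pass in are the ones from your own deployment.

```python
def cleanup_plan(guardrail_id, layer_name, layer_version, stack_name):
    """List the delete calls as (service, method, keyword arguments)."""
    return [
        ("bedrock", "delete_guardrail", {"guardrailIdentifier": guardrail_id}),
        ("lambda", "delete_layer_version",
         {"LayerName": layer_name, "VersionNumber": layer_version}),
        ("cloudformation", "delete_stack", {"StackName": stack_name}),
    ]


def run_cleanup(clients, plan):
    """Execute each planned call on the matching client."""
    for service, method, kwargs in plan:
        getattr(clients[service], method)(**kwargs)


# Usage (requires boto3 and AWS credentials):
#   import boto3
#   clients = {name: boto3.client(name) for name in ("bedrock", "lambda", "cloudformation")}
#   run_cleanup(clients, cleanup_plan("my-guardrail-id", "my-boto3-layer", 1, "my-stack-name"))
```

Separating the plan from its execution lets you print and review what will be deleted before running anything.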

Conclusion

As businesses turn to data as a foundation for decision-making, having the ability to efficiently extract insights from audio recordings is invaluable. By using the power of generative AI with Amazon Bedrock and Amazon Transcribe, your organization can create concise summaries of audio recordings while maintaining privacy and compliance. The proposed architecture demonstrates how AWS services can be orchestrated using AWS Step Functions to streamline and automate complex workflows, enabling organizations to focus on their core business activities.

This solution not only saves time and effort, but also makes sure that sensitive information is redacted, mitigating potential risks and promoting compliance with data protection regulations. As organizations continue to generate and process large volumes of audio data, solutions like this will become increasingly important for gaining insights, making informed decisions, and maintaining a competitive edge.


About the authors

Yash Yamsanwar is a Machine Learning Architect at Amazon Web Services (AWS). He is responsible for designing high-performance, scalable machine learning infrastructure that optimizes the full lifecycle of machine learning models, from training to deployment. Yash collaborates closely with ML research teams to push the boundaries of what is possible with LLMs and other cutting-edge machine learning technologies.

Sawyer Hirt is a Solutions Architect at AWS, specializing in AI/ML and cloud architectures, with a passion for helping businesses leverage cutting-edge technologies to overcome complex challenges. His expertise lies in designing and optimizing ML workflows, enhancing system performance, and making advanced AI solutions more accessible and cost-effective, with a particular focus on Generative AI. Outside of work, Sawyer enjoys traveling, spending time with family, and staying current with the latest developments in cloud computing and artificial intelligence.
