AWS Machine Learning Blog, July 3, 2024
Accenture creates a custom memory-persistent conversational user experience using Amazon Q Business

Traditionally, finding relevant information from documents has been a time-consuming and often frustrating process. Manually sifting through pages upon pages of text, searching for specific details, and synthesizing the information into coherent summaries can be a daunting task. This inefficiency not only hinders productivity but also increases the risk of overlooking critical insights buried within the document’s depths.

Imagine a scenario where a call center agent needs to quickly analyze multiple documents to provide summaries for clients. Previously, this process would involve painstakingly navigating through each document, a task that is both time-consuming and prone to human error.

With the advent of chatbots in the conversational artificial intelligence (AI) domain, you can now upload your documents through an intuitive interface and initiate a conversation by asking specific questions related to your inquiries. The chatbot then analyzes the uploaded documents, using advanced natural language processing (NLP) and machine learning (ML) technologies to provide comprehensive summaries tailored to your questions.

However, the true power lies in the chatbot’s ability to preserve context throughout the conversation. As you navigate through the discussion, the chatbot should maintain a memory of previous interactions, allowing you to review past discussions and retrieve specific details as needed. This seamless experience makes sure you can effortlessly explore the depths of your documents without losing track of the conversation’s flow.

Amazon Q Business is a generative AI-powered assistant that can answer questions, provide summaries, generate content, and securely complete tasks based on data and information in your enterprise systems. It empowers employees to be more creative, data-driven, efficient, prepared, and productive.

This post demonstrates how Accenture used Amazon Q Business to implement a chatbot application that offers straightforward attachment and conversation ID management. This solution can speed up your development workflow, and you can use it without crowding your application code.

“Amazon Q Business distinguishes itself by delivering personalized AI assistance through seamless integration with diverse data sources. It offers accurate, context-specific responses, contrasting with foundation models that typically require complex setup for similar levels of personalization. Amazon Q Business real-time, tailored solutions drive enhanced decision-making and operational efficiency in enterprise settings, making it superior for immediate, actionable insights.”

– Dominik Juran, Cloud Architect, Accenture

Solution overview

In this use case, an insurance provider uses a Retrieval Augmented Generation (RAG) based large language model (LLM) implementation to upload and compare policy documents efficiently. Policy documents are preprocessed and stored, allowing the system to retrieve relevant sections based on input queries. This enhances the accuracy, transparency, and speed of policy comparison, making sure clients receive the best coverage options.

This solution augments an Amazon Q Business application with persistent memory and context tracking throughout conversations. As users pose follow-up questions, Amazon Q Business can continually refine responses while recalling previous interactions. This preserves conversational flow when navigating in-depth inquiries.

At the core of this use case lies the creation of a custom Python class for Amazon Q Business, which streamlines the development workflow for this solution. This class offers robust document management capabilities, keeping track of attachments already shared within a conversation as well as new uploads to the Streamlit application. Additionally, it maintains an internal state to persist conversation IDs for future interactions, providing a seamless user experience.
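The attachment bookkeeping described above can be sketched as a small helper. This is a minimal, hypothetical illustration (the class and method names are ours, not taken from the solution's code): it returns only the files not yet shared in the conversation, capped at the five-attachment limit imposed by Amazon Q Business.

```python
class AttachmentTracker:
    """Hypothetical helper illustrating the delta-tracking idea:
    remember which files were already sent in a conversation and
    only forward the new ones on each call."""

    MAX_ATTACHMENTS_PER_CALL = 5  # Amazon Q Business attachment limit per request

    def __init__(self):
        self.shared = set()  # file names already sent in this conversation

    def delta(self, uploaded_file_names):
        """Return only files not yet shared, capped at the per-call limit."""
        new_files = [f for f in uploaded_file_names if f not in self.shared]
        batch = new_files[:self.MAX_ATTACHMENTS_PER_CALL]
        self.shared.update(batch)
        return batch
```

A tracker like this keeps the Streamlit layer free of attachment state: the UI can pass every file in its session on each request, and only the delta is actually attached to the Amazon Q Business call.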

The solution involves developing a web application using Streamlit, Python, and AWS services, featuring a chat interface where users can interact with an AI assistant to ask questions or upload PDF documents for analysis. Behind the scenes, the application uses Amazon Q Business for conversation history management, vectorizing the knowledge base, context creation, and NLP. The integration of these technologies allows for seamless communication between the user and the AI assistant, enabling tasks such as document summarization, question answering, and comparison of multiple documents based on the documents attached in real time.

The code uses Amazon Q Business APIs to interact with Amazon Q Business and send and receive messages within a conversation, specifically the qbusiness client from the boto3 library.

In this use case, we used the German language to test our RAG LLM implementation on 10 different documents and 10 different use cases. Policy documents were preprocessed and stored, enabling accurate retrieval of relevant sections based on input queries. This testing demonstrated the system’s accuracy and effectiveness in handling German language policy comparisons.

The following is a code snippet:

import boto3
import json
from botocore.exceptions import ClientError
from os import environ


class AmazonQHandler:
    def __init__(self, application_id, user_id, conversation_id=None, system_message_id=None):
        self.application_id = application_id
        self.user_id = user_id
        self.qbusiness = boto3.client('qbusiness')
        self.prompt_engineering_instruction = "Ansage: Auf Deutsch, und nur mit den nötigsten Wörter ohne ganze Sätze antworten, bitte"
        self.parent_message_id = system_message_id
        self.conversation_id = conversation_id

    def process_message(self, initial_message, input_text):
        print('Please ask as many questions as you want. At the end of the session write exit\n')
        message = f'{self.prompt_engineering_instruction}: {input_text}'
        return message

    def send_message(self, input_text, uploaded_file_names=None):
        attachments = []
        message = f'{self.prompt_engineering_instruction}: {input_text}'
        for file_name in uploaded_file_names or []:
            with open(file_name, "rb") as in_file:
                attachments.append({
                    'data': in_file.read(),
                    'name': file_name
                })
        if self.conversation_id:
            # Continue the existing conversation
            if attachments:
                resp = self.qbusiness.chat_sync(
                    applicationId=self.application_id,
                    userId=self.user_id,
                    userMessage=message,
                    conversationId=self.conversation_id,
                    parentMessageId=self.parent_message_id,
                    attachments=attachments,
                )
            else:
                resp = self.qbusiness.chat_sync(
                    applicationId=self.application_id,
                    userId=self.user_id,
                    userMessage=message,
                    conversationId=self.conversation_id,
                    parentMessageId=self.parent_message_id,
                )
        else:
            # Start a new conversation and remember its ID for later calls
            if attachments:
                resp = self.qbusiness.chat_sync(
                    applicationId=self.application_id,
                    userId=self.user_id,
                    userMessage=message,
                    attachments=attachments,
                )
            else:
                resp = self.qbusiness.chat_sync(
                    applicationId=self.application_id,
                    userId=self.user_id,
                    userMessage=message,
                )
            self.conversation_id = resp.get("conversationId")
        print(f'Amazon Q: "{resp.get("systemMessage")}"\n')
        self.parent_message_id = resp.get("systemMessageId")
        return resp.get("systemMessage")


if __name__ == '__main__':
    application_id = environ.get("APPLICATION_ID", "a392f5e9-50ed-4f93-bcad-6f8a26a8212d")
    user_id = environ.get("USER_ID", "AmazonQ-Administrator")
    amazon_q_handler = AmazonQHandler(application_id, user_id)
    print(amazon_q_handler.process_message(None, 'exit'))
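The conversation-ID handling above can be exercised without AWS credentials by stubbing out the boto3 client. The following is a condensed, self-contained sketch (the stub class, its canned response values, and the `MiniHandler` name are ours, purely for illustration) showing that the first call starts a conversation and later calls reuse the stored IDs:

```python
class FakeQBusinessClient:
    """Stand-in for the boto3 qbusiness client, for illustration only."""

    def __init__(self):
        self.calls = []  # record the keyword arguments of each chat_sync call

    def chat_sync(self, **kwargs):
        self.calls.append(kwargs)
        return {
            "conversationId": "conv-123",
            "systemMessageId": "msg-456",
            "systemMessage": "Antwort: Hausratversicherung",
        }


class MiniHandler:
    """Condensed version of the conversation-ID logic shown above."""

    def __init__(self, client, application_id="app-id", user_id="user-1"):
        self.qbusiness = client
        self.application_id = application_id
        self.user_id = user_id
        self.conversation_id = None
        self.parent_message_id = None

    def send_message(self, message):
        kwargs = {
            "applicationId": self.application_id,
            "userId": self.user_id,
            "userMessage": message,
        }
        # Only pass conversationId/parentMessageId once a conversation exists
        if self.conversation_id:
            kwargs["conversationId"] = self.conversation_id
            kwargs["parentMessageId"] = self.parent_message_id
        resp = self.qbusiness.chat_sync(**kwargs)
        self.conversation_id = resp.get("conversationId")
        self.parent_message_id = resp.get("systemMessageId")
        return resp.get("systemMessage")


fake = FakeQBusinessClient()
handler = MiniHandler(fake)
first = handler.send_message("Welche Police deckt Hausrat ab?")
second = handler.send_message("Und die Selbstbeteiligung?")
# The first call carries no conversationId; the second reuses the stored IDs.
```

Injecting the client this way also makes the wrapper class unit-testable, which is one practical benefit of keeping conversation state inside the class rather than in Streamlit session state.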

The architectural flow of this solution is shown in the following diagram.

The workflow consists of the following steps:

    1. The LLM wrapper application code is containerized using AWS CodePipeline, a fully managed continuous delivery service that automates the build, test, and deploy phases of the software release process.
    2. The application is deployed to Amazon Elastic Container Service (Amazon ECS), a highly scalable and reliable container orchestration service that provides optimal resource utilization and high availability. Because we were making the calls from a Flask-based ECS task running Streamlit to Amazon Q Business, we used Amazon Cognito user pools rather than AWS IAM Identity Center to authenticate users for simplicity, and we hadn’t experimented with IAM Identity Center on Amazon Q Business at the time. For instructions to set up IAM Identity Center integration with Amazon Q Business, refer to Setting up Amazon Q Business with IAM Identity Center as identity provider.
    3. Users authenticate through an Amazon Cognito UI, a secure user directory that scales to millions of users and integrates with various identity providers.
    4. A Streamlit application running on Amazon ECS receives the authenticated user’s request.
    5. An instance of the custom AmazonQ class is initiated. If an ongoing Amazon Q Business conversation is present, the correct conversation ID is persisted, providing continuity. If no existing conversation is found, a new conversation is initiated.
    6. Documents attached to the Streamlit state are passed to the instance of the AmazonQ class, which keeps track of the delta between the documents already attached to the conversation ID and the documents yet to be shared. This approach respects and optimizes the five-attachment limit imposed by Amazon Q Business.
To simplify and avoid repetitions in the middleware library code we are maintaining on the Streamlit application, we decided to write a custom wrapper class for the Amazon Q Business calls, which keeps the attachment and conversation history management in itself as class variables (as opposed to state-based management on the Streamlit level). Our wrapper Python class encapsulating the Amazon Q Business instance parses and returns the answers based on the conversation ID and the dynamically provided context derived from the user’s question. Amazon ECS serves the answer to the authenticated user, providing a secure and scalable delivery of the response.

Prerequisites

This solution has the following prerequisites:

Deploy the solution

The deployment process entails provisioning the required AWS infrastructure, configuring environment variables, and deploying the application code. This is accomplished by using AWS services such as CodePipeline and Amazon ECS for container orchestration and Amazon Q Business for NLP.

Additionally, Amazon Cognito is integrated with Amazon ECS using the AWS Cloud Development Kit (AWS CDK) and user pools are used for user authentication and management. After deployment, you can access the application through a web browser. Amazon Q Business is called from the ECS task. It is crucial to establish proper access permissions and security measures to safeguard user data and uphold the application’s integrity.

We use AWS CDK to deploy a web application using Amazon ECS with AWS Fargate, Amazon Cognito for user authentication, and AWS Certificate Manager for SSL/TLS certificates.
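A CDK stack of this shape might look like the following sketch. This is not the solution's actual stack: the construct names, container image, and port are placeholder assumptions, and it assumes aws-cdk-lib v2 is installed. It is included only to show how the ECS-with-Fargate service and the Cognito user pool fit together in code.

```python
# Hypothetical CDK stack sketch; resource names and the image are placeholders.
from aws_cdk import App, Stack
from aws_cdk import aws_cognito as cognito
from aws_cdk import aws_ecs as ecs
from aws_cdk import aws_ecs_patterns as ecs_patterns
from constructs import Construct


class ChatbotStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # Cognito user pool authenticating users of the Streamlit UI
        cognito.UserPool(self, "UserPool", self_sign_up_enabled=False)

        # Fargate service fronted by an Application Load Balancer;
        # the image would be the containerized Streamlit wrapper app
        ecs_patterns.ApplicationLoadBalancedFargateService(
            self, "StreamlitService",
            task_image_options=ecs_patterns.ApplicationLoadBalancedTaskImageOptions(
                image=ecs.ContainerImage.from_registry("public.ecr.aws/docker/library/python:3.11"),
                container_port=8501,  # Streamlit's default port
            ),
            desired_count=1,
        )


app = App()
ChatbotStack(app, "QBusinessChatbotStack")
app.synth()
```

A real deployment would additionally wire the ACM certificate into the load balancer listener and grant the task role permission to call Amazon Q Business.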

To deploy the infrastructure, run the following commands:
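The exact commands from the original post are not reproduced here; for a CDK app written in Python, the typical sequence looks like the following sketch (assuming the stack code and a requirements.txt live in the current directory):

```shell
# Typical AWS CDK (Python) deployment flow; adjust to your project layout
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt   # aws-cdk-lib and app dependencies
cdk bootstrap                     # one-time setup per account/Region
cdk deploy                        # synthesizes and provisions the CloudFormation stack
```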

The following screenshot shows our deployed CloudFormation stack.

UI demonstration

The following screenshot shows the home page when a user opens the application in a web browser.

The following screenshot shows an example response from Amazon Q Business when no file was uploaded and no relevant answer to the question was found.

The following screenshot illustrates the entire application flow, where the user asked a question before a file was uploaded, then uploaded a file, and asked the same question again. The response from Amazon Q Business after uploading the file is different from the first query (for testing purposes, we used a very simple file with randomly generated text in PDF format).

Solution benefits

This solution offers the following benefits:

This containerized architecture allows the solution to scale seamlessly while optimizing request throughput. Persisting the conversation state enhances precision by continuously expanding dialog context. Overall, this solution can help you balance performance with the fidelity of a persistent, context-aware AI assistant through Amazon Q Business.

Clean up

After deployment, you should implement a thorough cleanup plan to maintain efficient resource management and mitigate unnecessary costs, particularly concerning the AWS services used in the deployment process. This plan should include the following key steps:

By diligently implementing these cleanup procedures, you can effectively minimize expenses, optimize resource usage, and maintain a tidy environment for future development iterations or deployments. Additionally, regular review and adjustment of AWS services and configurations is recommended to provide ongoing cost-effectiveness and operational efficiency.

If the solution runs in AWS Amplify or is provisioned by the AWS CDK, you don’t need to remove everything described in this section individually; deleting the Amplify application or AWS CDK stack is enough to get rid of all the resources associated with the application.

Conclusion

In this post, we showcased how Accenture created a custom memory-persistent conversational assistant using AWS generative AI services. The solution can help clients develop end-to-end, persistent conversational chatbot applications at large scale by following the architectural practices and guidelines provided here.

The joint effort between Accenture and AWS builds on the 15-year strategic relationship between the companies and uses the same proven mechanisms and accelerators built by the Accenture AWS Business Group (AABG). Connect with the AABG team at accentureaws@amazon.com to drive business outcomes by transforming to an intelligent data enterprise on AWS.

For further information about generative AI on AWS using Amazon Bedrock or Amazon Q Business, we recommend the following resources:

You can also sign up for the AWS generative AI newsletter, which includes educational resources, blog posts, and service updates.


About the Authors

Dominik Juran works as a full stack developer at Accenture with a focus on AWS technologies and AI. He also has a passion for ice hockey.

Milica Bozic works as a Cloud Engineer at Accenture, specializing in AWS Cloud solutions for clients’ specific needs, with a background in telecommunications, particularly 4G and 5G technologies. Mili is passionate about art, books, and movement training, finding inspiration in creative expression and physical activity.

Zdenko Estok works as a cloud architect and DevOps engineer at Accenture. He works with AABG to develop and implement innovative cloud solutions, and specializes in infrastructure as code and cloud security. Zdenko likes to bike to the office and enjoys pleasant walks in nature.

Selimcan “Can” Sakar is a cloud first developer and solution architect at Accenture with a focus on artificial intelligence and a passion for watching models converge.

Shikhar Kwatra is a Sr. AI/ML Specialist Solutions Architect at Amazon Web Services, working with leading Global System Integrators. He has earned the title of one of the Youngest Indian Master Inventors with over 500 patents in the AI/ML and IoT domains. Shikhar aids in architecting, building, and maintaining cost-efficient, scalable cloud environments for the organization, and supports the GSI partner in building strategic industry solutions on AWS. Shikhar enjoys playing guitar, composing music, and practicing mindfulness in his spare time.
