AWS Machine Learning Blog, October 8, 2024
Build a generative AI Slack chat assistant using Amazon Bedrock and Amazon Kendra

Despite the proliferation of information and data in business environments, employees and stakeholders often find themselves searching for information and struggling to get their questions answered quickly and efficiently. This can lead to productivity losses, frustration, and delays in decision-making.

A generative AI Slack chat assistant can help address these challenges by providing a readily available, intelligent interface for users to interact with and obtain the information they need. By using the natural language processing and generation capabilities of generative AI, the chat assistant can understand user queries, retrieve relevant information from various data sources, and provide tailored, contextual responses.

By harnessing the power of generative AI and Amazon Web Services (AWS) services Amazon Bedrock, Amazon Kendra, and Amazon Lex, this solution provides a sample architecture to build an intelligent Slack chat assistant that can streamline information access, enhance user experiences, and drive productivity and efficiency within organizations.

Why use Amazon Kendra for building a RAG application?

Amazon Kendra is a fully managed service that provides out-of-the-box semantic search capabilities for state-of-the-art ranking of documents and passages. You can use Amazon Kendra to quickly build high-accuracy generative AI applications on enterprise data and source the most relevant content and documents to maximize the quality of your Retrieval Augmented Generation (RAG) payload, yielding better large language model (LLM) responses than using conventional or keyword-based search solutions. Amazon Kendra offers simple-to-use deep learning search models that are pre-trained on 14 domains and don’t require machine learning (ML) expertise. Amazon Kendra can index content from a wide range of sources, including databases, content management systems, file shares, and web pages.

Further, the FAQ feature in Amazon Kendra complements the broader retrieval capabilities of the service, allowing the RAG system to seamlessly switch between providing prewritten FAQ responses and dynamically generating responses by querying the larger knowledge base. This makes it well-suited for powering the retrieval component of a RAG system, allowing the model to access a broad knowledge base when generating responses. By integrating the FAQ capabilities of Amazon Kendra into a RAG system, the model can use a curated set of high-quality, authoritative answers for commonly asked questions. This can improve the overall response quality and user experience, while also reducing the burden on the language model to generate these basic responses from scratch.

This solution balances retaining customizations in terms of model selection, prompt engineering, and adding FAQs with not having to deal with word embeddings, document chunking, and other lower-level complexities typically required for RAG implementations.

Solution overview

The chat assistant is designed to assist users by answering their questions and providing information on a variety of topics. The purpose of the chat assistant is to be an internal-facing Slack tool that can help employees and stakeholders find the information they need.

The architecture uses Amazon Lex for intent recognition, AWS Lambda for processing queries, Amazon Kendra for searching through FAQs and web content, and Amazon Bedrock for generating contextual responses powered by LLMs. By combining these services, the chat assistant can understand natural language queries, retrieve relevant information from multiple data sources, and provide humanlike responses tailored to the user's needs. The solution showcases the power of generative AI in creating intelligent virtual assistants that can streamline workflows and enhance user experiences, and that can be tailored through model choice, FAQs, and changes to system prompts and inference parameters.

Architecture diagram

The following diagram illustrates a RAG approach where the user sends a query through the Slack application and receives a generated response based on the data indexed in Amazon Kendra. In this post, we use Amazon Kendra Web Crawler as the data source and include FAQs stored on Amazon Simple Storage Service (Amazon S3). See Data source connectors for a list of supported data source connectors for Amazon Kendra.
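
The CloudFormation template in this post provisions the index, the Web Crawler data source, and the FAQs for you. Purely as an illustration of what gets created, a minimal boto3 sketch might look like the following; the index ID, role ARNs, and bucket name are hypothetical placeholders.

import boto3

kendra = boto3.client('kendra')

# Attach the SampleFAQ.csv stored on Amazon S3 as an FAQ (placeholder names and ARNs)
kendra.create_faq(
    IndexId='<kendra-index-id>',
    Name='SampleFAQ',
    S3Path={'Bucket': '<your-s3-bucket>', 'Key': 'SampleFAQ.csv'},
    RoleArn='<kendra-faq-role-arn>',
    FileFormat='CSV',
)

# Add a Web Crawler data source seeded with the Amazon Bedrock service page
response = kendra.create_data_source(
    IndexId='<kendra-index-id>',
    Name='WebCrawlerDataSource',
    Type='WEBCRAWLER',
    RoleArn='<kendra-datasource-role-arn>',
    Configuration={
        'WebCrawlerConfiguration': {
            'Urls': {
                'SeedUrlConfiguration': {
                    'SeedUrls': ['https://aws.amazon.com/bedrock/'],
                    'WebCrawlerMode': 'HOST_ONLY',
                }
            }
        }
    },
)

# Start crawling and indexing the seeded pages
kendra.start_data_source_sync_job(Id=response['Id'], IndexId='<kendra-index-id>')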

The step-by-step workflow for the architecture is the following:

1. The user sends a query such as "What is the AWS Well-Architected Framework?" through the Slack app. The query goes to Amazon Lex, which identifies the intent. Currently, two intents are configured in Amazon Lex (Welcome and FallbackIntent). The Welcome intent is configured to respond with a greeting when a user enters a greeting such as "hi" or "hello." The assistant responds with "Hello! I can help you with queries based on the documents provided. Ask me a question." The fallback intent is fulfilled with a Lambda function.
2. The Lambda function searches Amazon Kendra FAQs through the search_Kendra_FAQ method by taking the user query and Amazon Kendra index ID as inputs. If there's a match with a high confidence score, the answer from the FAQ is returned to the user.

import boto3

def search_Kendra_FAQ(question, kendra_index_id):
    """
    This function takes in the question from the user and checks if the question exists in the Kendra FAQs.
    :param question: The question the user is asking that was asked via the frontend input text box.
    :param kendra_index_id: The Kendra index containing the documents and FAQs.
    :return: If found in the FAQs, returns the answer along with any relevant links. If not, returns False, and the
    kendra_retrieve_document function is then called.
    """
    kendra_client = boto3.client('kendra')
    # Query the index for FAQ (question-answer) results only
    response = kendra_client.query(IndexId=kendra_index_id, QueryText=question, QueryResultTypeFilter='QUESTION_ANSWER')
    for item in response['ResultItems']:
        score_confidence = item['ScoreAttributes']['ScoreConfidence']
        # Take answers only from FAQs that have a very high confidence score
        if score_confidence == 'VERY_HIGH' and len(item['AdditionalAttributes']) > 1:
            text = item['AdditionalAttributes'][1]['Value']['TextWithHighlightsValue']['Text']
            url = "None"
            if item['DocumentURI'] != '':
                url = item['DocumentURI']
            return (text, url)
    return (False, False)

3. If there isn't a match with a high enough confidence score, relevant documents from Amazon Kendra with a high confidence score are retrieved through the kendra_retrieve_document method and sent to Amazon Bedrock as the context for generating a response.

import boto3

def kendra_retrieve_document(question, kendra_index_id):
    """
    This function takes in the question from the user and retrieves relevant passages based on the default PageSize of 10.
    :param question: The question the user is asking that was asked via the frontend input text box.
    :param kendra_index_id: The Kendra index containing the documents and FAQs.
    :return: Returns the context to be sent to the LLM and the document URIs to be returned as relevant data sources.
    """
    kendra_client = boto3.client('kendra')
    documents = kendra_client.retrieve(IndexId=kendra_index_id, QueryText=question)
    text = ""
    uris = set()
    # Keep only passages with a HIGH or VERY_HIGH confidence score
    for item in documents['ResultItems']:
        score_confidence = item['ScoreAttributes']['ScoreConfidence']
        if score_confidence in ('VERY_HIGH', 'HIGH'):
            text += item['Content'] + "\n"
            uris.add(item['DocumentURI'])
    return (text, uris)

4. The response is generated from Amazon Bedrock with the invokeLLM method. The following is a snippet of the invokeLLM method within the fulfillment function. Read more on inference parameters and system prompts to modify parameters that are passed into the Amazon Bedrock invoke model request.

import boto3
import json

def invokeLLM(question, context, modelId):
    """
    This function takes in the question from the user, along with the Kendra responses as context, to generate an answer
    for the user on the frontend.
    :param question: The question the user is asking that was asked via the frontend input text box.
    :param context: The response from the Kendra document retrieve query, used as context to generate a better answer.
    :param modelId: The ID of the Amazon Bedrock model used to generate the answer.
    :return: Returns the final answer that will be provided to the end user of the application who asked the original
    question.
    """
    # Set up the Amazon Bedrock runtime client
    bedrock = boto3.client('bedrock-runtime')
    # Body of data with parameters that is passed into the Bedrock invoke model request
    body = json.dumps({
        "max_tokens": 350,
        "system": "You are a truthful AI assistant. Your goal is to provide informative and substantive responses to queries based on the documents provided. If you do not know the answer to a question, you truthfully say you do not know.",
        "messages": [{"role": "user", "content": "Answer this user query: " + question + " with the following context: " + context}],
        "anthropic_version": "bedrock-2023-05-31",
        "temperature": 0,
        "top_k": 250,
        "top_p": 0.999
    })
    # Invoke the Bedrock model with your specifications
    response = bedrock.invoke_model(body=body, modelId=modelId)
    # The body of the response that was generated
    response_body = json.loads(response.get('body').read())
    # The messages API returns a list of content blocks; the answer text is in the first block
    answer = response_body.get('content')[0].get('text')
    # Return the answer as the final result, which ultimately gets returned to the end user
    return answer

5. Finally, the response generated from Amazon Bedrock, along with the relevant referenced URLs, is returned to the end user.
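
To see how the preceding methods fit together, the following is a minimal sketch of a Lex V2 fulfillment handler that tries the FAQs first and falls back to retrieval plus generation. It is an illustration rather than the exact Lambda function from the solution: the KENDRA_INDEX_ID environment variable, the model choice, and the way source URLs are appended to the reply are all assumptions, while the return value follows the Lex V2 Lambda response format.

import os

# Hypothetical wiring of the methods above into a Lex V2 fulfillment handler
KENDRA_INDEX_ID = os.environ['KENDRA_INDEX_ID']  # assumed environment variable
MODEL_ID = 'anthropic.claude-3-sonnet-20240229-v1:0'  # one of the supported models

def lambda_handler(event, context):
    question = event['inputTranscript']  # the raw user utterance relayed from Slack through Amazon Lex
    # Try the curated FAQs first
    answer, url = search_Kendra_FAQ(question, KENDRA_INDEX_ID)
    if not answer:
        # Fall back to RAG: retrieve passages, then generate with Amazon Bedrock
        context_text, uris = kendra_retrieve_document(question, KENDRA_INDEX_ID)
        if context_text:
            answer = invokeLLM(question, context_text, MODEL_ID)
            url = ", ".join(uris)
        else:
            answer, url = "No relevant documents found", None
    message = answer if url in (None, "None") else answer + "\nSource: " + url
    # Close the conversation turn with the answer (Lex V2 response format)
    return {
        "sessionState": {
            "dialogAction": {"type": "Close"},
            "intent": {"name": event['sessionState']['intent']['name'], "state": "Fulfilled"},
        },
        "messages": [{"contentType": "PlainText", "content": message}],
    }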

When selecting websites to index, adhere to the AWS Acceptable Use Policy and other AWS terms. Remember that you can only use Amazon Kendra Web Crawler to index your own web pages or web pages that you have authorization to index. Visit the Amazon Kendra Web Crawler data source guide to learn more about using the web crawler as a data source. Using Amazon Kendra Web Crawler to aggressively crawl websites or web pages you don't own is not considered acceptable use.

Supported features

The chat assistant supports the following features:

- Support for the following Anthropic models on Amazon Bedrock:
  - claude-v2
  - claude-3-haiku-20240307-v1:0
  - claude-instant-v1
  - claude-3-sonnet-20240229-v1:0
- Support for FAQs and the Amazon Kendra Web Crawler data source
- Returns FAQ answers only if the confidence score is VERY_HIGH
- Retrieves only documents from Amazon Kendra that have a HIGH or VERY_HIGH confidence score
- If documents with a high confidence score aren't found, the chat assistant returns "No relevant documents found"
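
Note that the model names above are shorthand; the modelId passed to the Amazon Bedrock invoke model request is the fully qualified identifier. Assuming the standard Bedrock naming for Anthropic models, the mapping looks like the following:

# Fully qualified Bedrock model IDs for the supported models
# (assumes the standard "anthropic." prefix used by Amazon Bedrock)
SUPPORTED_MODELS = {
    "claude-v2": "anthropic.claude-v2",
    "claude-instant-v1": "anthropic.claude-instant-v1",
    "claude-3-haiku-20240307-v1:0": "anthropic.claude-3-haiku-20240307-v1:0",
    "claude-3-sonnet-20240229-v1:0": "anthropic.claude-3-sonnet-20240229-v1:0",
}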

Prerequisites

To implement this solution, you need the following prerequisites:

- Basic knowledge of AWS
- An AWS account with access to Amazon S3 and Amazon Kendra
- An S3 bucket to store your documents. For more information, see Step 1: Create your first S3 bucket and the Amazon S3 User Guide.
- A Slack workspace to integrate the chat assistant
- Permission to install Slack apps in your Slack workspace
- Seed URLs for the Amazon Kendra Web Crawler data source. You'll need authorization to crawl and index any websites provided.
- AWS CloudFormation for deploying the solution resources

Build a generative AI Slack chat assistant

To build the Slack application, use the following steps:

1. Request model access on Amazon Bedrock for all Anthropic models.
2. Create an S3 bucket in the us-east-1 (N. Virginia) AWS Region.
3. Upload the AIBot-LexJson.zip and SampleFAQ.csv files to the S3 bucket.
4. Launch the CloudFormation stack in the us-east-1 (N. Virginia) AWS Region (a scripted boto3 alternative is sketched after these steps).
5. Enter a Stack name of your choice.
6. For S3BucketName, enter the name of the S3 bucket created in Step 2.
7. For S3KendraFAQKey, enter the name of the SampleFAQ.csv file uploaded to the S3 bucket in Step 3.
8. For S3LexBotKey, enter the name of the Amazon Lex .zip file uploaded to the S3 bucket in Step 3.
9. For SeedUrls, enter up to 10 URLs for the web crawler as a comma-delimited list. In the example in this post, we give the publicly available Amazon Bedrock service page as the seed URL.
10. Leave the rest as defaults and choose Next.
11. Choose Next again on the Configure stack options page.
12. Acknowledge by selecting the box and choose Submit, as shown in the following screenshot.
13. Wait for the stack creation to complete.
14. Verify that all resources are created.
15. Test on the AWS Management Console for Amazon Lex:
    a. On the Amazon Lex console, choose your chat assistant ${YourStackName}-AIBot.
    b. Choose Intents.
    c. Choose Version 1 and choose Test, as shown in the following screenshot.
    d. Select the AIBotProdAlias and choose Confirm, as shown in the following screenshot. If you want to make changes to the chat assistant, you can use the draft version, publish a new version, and assign the new version to the AIBotProdAlias. Learn more about Versioning and Aliases.
    e. Test the chat assistant with questions such as "Which AWS service has 11 nines of durability?" and "What is the AWS Well-Architected Framework?" and verify the responses. The following table shows the three FAQs in the sample .csv file.
| _question | _answer | _source_uri |
| --- | --- | --- |
| Which AWS service has 11 nines of durability? | Amazon S3 | https://aws.amazon.com/s3/ |
| What is the AWS Well-Architected Framework? | The AWS Well-Architected Framework enables customers and partners to review their architectures using a consistent approach and provides guidance to improve designs over time. | https://aws.amazon.com/architecture/well-architected/ |
| In what Regions is Amazon Kendra available? | Amazon Kendra is currently available in the following AWS Regions: Northern Virginia, Oregon, and Ireland | https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/ |
    f. The following screenshot shows the question "Which AWS service has 11 nines of durability?" and its response. You can observe that the response is the same as in the FAQ file and includes a link.
    g. Based on the pages you have crawled, ask a question in the chat. For this example, the publicly available Amazon Bedrock page was crawled and indexed. The following screenshot shows the question "What are agents in Amazon Bedrock?" and a generated response that includes relevant links.
16. For integration of the Amazon Lex chat assistant with Slack, see Integrating an Amazon Lex V2 bot with Slack. Choose the AIBotProdAlias under Alias in the Channel Integrations section.
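
If you prefer to script steps 4 through 13 rather than use the console, a boto3 equivalent might look like the following sketch. The stack name and the template location are hypothetical placeholders, and the Capabilities value corresponds to the acknowledgment box in the console (what is actually required depends on the IAM resources the template creates).

import boto3

cfn = boto3.client('cloudformation', region_name='us-east-1')

# Launch the stack with the same parameters described in the steps above
cfn.create_stack(
    StackName='ai-slack-bot',  # hypothetical stack name
    TemplateURL='https://<your-s3-bucket>.s3.amazonaws.com/AIBot-template.yaml',  # hypothetical template location
    Parameters=[
        {'ParameterKey': 'S3BucketName', 'ParameterValue': '<your-s3-bucket>'},
        {'ParameterKey': 'S3KendraFAQKey', 'ParameterValue': 'SampleFAQ.csv'},
        {'ParameterKey': 'S3LexBotKey', 'ParameterValue': 'AIBot-LexJson.zip'},
        {'ParameterKey': 'SeedUrls', 'ParameterValue': 'https://aws.amazon.com/bedrock/'},
    ],
    Capabilities=['CAPABILITY_IAM', 'CAPABILITY_NAMED_IAM'],
)

# Block until stack creation completes (step 13)
cfn.get_waiter('stack_create_complete').wait(StackName='ai-slack-bot')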

Run sample queries to test the solution

1. In Slack, go to the Apps section. In the dropdown menu, choose Manage and select Browse apps.
2. Search for ${AIBot} in the App Directory and choose the chat assistant. This adds the chat assistant to the Apps section in Slack. You can now start asking questions in the chat. The following screenshot shows the question "Which AWS service has 11 nines of durability?" and its response. You can observe that the response is the same as in the FAQ file and includes a link.
3. The following screenshot shows the question "What is the AWS Well-Architected Framework?" and its response.
4. Based on the pages you have crawled, ask a question in the chat. For this example, the publicly available Amazon Bedrock page was crawled and indexed. The following screenshot shows the question "What are agents in Amazon Bedrock?" and a generated response that includes relevant links.
5. The following screenshot shows the question "What is amazon polly?" Because there is no Amazon Polly documentation indexed, the chat assistant responds with "No relevant documents found," as expected.

These examples show how the chat assistant retrieves documents from Amazon Kendra and provides answers based on the documents retrieved. If no relevant documents are found, the chat assistant responds with "No relevant documents found."

Clean up

To clean up the resources created by this solution:

1. Delete the CloudFormation stack by navigating to the CloudFormation console.
2. Select the stack you created for this solution and choose Delete.
3. Confirm the deletion by entering the stack name in the provided field. This will remove all the resources created by the CloudFormation template, including the Amazon Kendra index, Amazon Lex chat assistant, Lambda function, and other related resources.
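
As with deployment, teardown can be scripted. A minimal boto3 equivalent, using the same hypothetical stack name as in the deployment sketch, is:

import boto3

cfn = boto3.client('cloudformation', region_name='us-east-1')

# Delete the stack and wait for all of its resources to be removed
cfn.delete_stack(StackName='ai-slack-bot')  # hypothetical stack name
cfn.get_waiter('stack_delete_complete').wait(StackName='ai-slack-bot')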

Conclusion

This post describes the development of a generative AI Slack application powered by Amazon Bedrock and Amazon Kendra. It is designed to be an internal-facing Slack chat assistant that helps answer questions related to the indexed content. The solution architecture includes Amazon Lex for intent identification, a Lambda function for fulfilling the fallback intent, Amazon Kendra for FAQ searches and indexing crawled web pages, and Amazon Bedrock for generating responses. The post walks through the deployment of the solution using a CloudFormation template, provides instructions for running sample queries, and discusses the steps for cleaning up the resources. Overall, this post demonstrates how to use various AWS services to build a powerful generative AI–powered chat assistant application.

This solution demonstrates the power of generative AI in building intelligent chat and search assistants. Explore the generative AI Slack chat assistant: invite your teams to a Slack workspace and start getting answers based on your indexed content and FAQs. Experiment with different use cases and see how you can harness the capabilities of services like Amazon Bedrock and Amazon Kendra to enhance your business operations. For more information about using Amazon Bedrock with Slack, refer to Deploy a Slack gateway for Amazon Bedrock.


About the authors

Kruthi Jayasimha Rao is a Partner Solutions Architect with a focus on AI and ML. She provides technical guidance to AWS Partners in following best practices to build secure, resilient, and highly available solutions in the AWS Cloud.

Mohamed Mohamud is a Partner Solutions Architect with a focus on Data Analytics. He specializes in streaming analytics, helping partners build real-time data pipelines and analytics solutions on AWS. With expertise in services like Amazon Kinesis, Amazon MSK, and Amazon EMR, Mohamed enables data-driven decision-making through streaming analytics.
