AWS Machine Learning Blog, October 8, 2024
Build a generative AI Slack chat assistant using Amazon Bedrock and Amazon Kendra

Despite the proliferation of information and data in business environments, employees and stakeholders often find themselves searching for information and struggling to get their questions answered quickly and efficiently. This can lead to productivity losses, frustration, and delays in decision-making.

A generative AI Slack chat assistant can help address these challenges by providing a readily available, intelligent interface for users to interact with and obtain the information they need. By using the natural language processing and generation capabilities of generative AI, the chat assistant can understand user queries, retrieve relevant information from various data sources, and provide tailored, contextual responses.

By harnessing the power of generative AI and Amazon Web Services (AWS) services Amazon Bedrock, Amazon Kendra, and Amazon Lex, this solution provides a sample architecture to build an intelligent Slack chat assistant that can streamline information access, enhance user experiences, and drive productivity and efficiency within organizations.

Why use Amazon Kendra for building a RAG application?

Amazon Kendra is a fully managed service that provides out-of-the-box semantic search capabilities for state-of-the-art ranking of documents and passages. You can use Amazon Kendra to quickly build high-accuracy generative AI applications on enterprise data and source the most relevant content and documents to maximize the quality of your Retrieval Augmented Generation (RAG) payload, yielding better large language model (LLM) responses than using conventional or keyword-based search solutions. Amazon Kendra offers simple-to-use deep learning search models that are pre-trained on 14 domains and don’t require machine learning (ML) expertise. Amazon Kendra can index content from a wide range of sources, including databases, content management systems, file shares, and web pages.

Further, the FAQ feature in Amazon Kendra complements the broader retrieval capabilities of the service, allowing the RAG system to seamlessly switch between providing prewritten FAQ responses and dynamically generating responses by querying the larger knowledge base. This makes it well-suited for powering the retrieval component of a RAG system, allowing the model to access a broad knowledge base when generating responses. By integrating the FAQ capabilities of Amazon Kendra into a RAG system, the model can use a curated set of high-quality, authoritative answers for commonly asked questions. This can improve the overall response quality and user experience, while also reducing the burden on the language model to generate these basic responses from scratch.

This solution balances retaining customizations in terms of model selection, prompt engineering, and adding FAQs with not having to deal with word embeddings, document chunking, and other lower-level complexities typically required for RAG implementations.

Solution overview

The chat assistant is designed to assist users by answering their questions and providing information on a variety of topics. The purpose of the chat assistant is to be an internal-facing Slack tool that can help employees and stakeholders find the information they need.

The architecture uses Amazon Lex for intent recognition, AWS Lambda for processing queries, Amazon Kendra for searching through FAQs and web content, and Amazon Bedrock for generating contextual responses powered by LLMs. By combining these services, the chat assistant can understand natural language queries, retrieve relevant information from multiple data sources, and provide humanlike responses tailored to the user's needs. The solution showcases the power of generative AI in creating intelligent virtual assistants that can streamline workflows and enhance user experiences, and that can be tailored through model choice, FAQs, and changes to system prompts and inference parameters.

Architecture diagram

The following diagram illustrates a RAG approach where the user sends a query through the Slack application and receives a generated response based on the data indexed in Amazon Kendra. In this post, we use Amazon Kendra Web Crawler as the data source and include FAQs stored on Amazon Simple Storage Service (Amazon S3). See Data source connectors for a list of supported data source connectors for Amazon Kendra.
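
The CloudFormation template in this post provisions the index, the Web Crawler data source, and the FAQs for you. Purely as an illustration of what gets created, a minimal boto3 sketch might look like the following; the index ID, role ARNs, and bucket name are hypothetical placeholders.

import boto3

kendra = boto3.client('kendra')

# Attach the SampleFAQ.csv stored on Amazon S3 as an FAQ (placeholder names and ARNs)
kendra.create_faq(
    IndexId='<kendra-index-id>',
    Name='SampleFAQ',
    S3Path={'Bucket': '<your-s3-bucket>', 'Key': 'SampleFAQ.csv'},
    RoleArn='<kendra-faq-role-arn>',
    FileFormat='CSV',
)

# Add a Web Crawler data source seeded with the Amazon Bedrock service page
response = kendra.create_data_source(
    IndexId='<kendra-index-id>',
    Name='WebCrawlerDataSource',
    Type='WEBCRAWLER',
    RoleArn='<kendra-datasource-role-arn>',
    Configuration={
        'WebCrawlerConfiguration': {
            'Urls': {
                'SeedUrlConfiguration': {
                    'SeedUrls': ['https://aws.amazon.com/bedrock/'],
                    'WebCrawlerMode': 'HOST_ONLY',
                }
            }
        }
    },
)

# Start crawling and indexing the seeded pages
kendra.start_data_source_sync_job(Id=response['Id'], IndexId='<kendra-index-id>')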

The step-by-step workflow for the architecture is the following:

1. The user sends a query such as "What is the AWS Well-Architected Framework?" through the Slack app. The query goes to Amazon Lex, which identifies the intent. Currently, two intents are configured in Amazon Lex (Welcome and FallbackIntent). The Welcome intent is configured to respond with a greeting when a user enters a greeting such as "hi" or "hello." The assistant responds with "Hello! I can help you with queries based on the documents provided. Ask me a question." The fallback intent is fulfilled with a Lambda function.
2. The Lambda function searches Amazon Kendra FAQs through the search_Kendra_FAQ method by taking the user query and Amazon Kendra index ID as inputs. If there's a match with a high confidence score, the answer from the FAQ is returned to the user.

import boto3

def search_Kendra_FAQ(question, kendra_index_id):
    """
    This function takes in the question from the user and checks if the question exists in the Kendra FAQs.
    :param question: The question the user is asking that was asked via the frontend input text box.
    :param kendra_index_id: The Kendra index containing the documents and FAQs.
    :return: If found in the FAQs, returns the answer along with any relevant links. If not, returns False, and the
    kendra_retrieve_document function is then called.
    """
    kendra_client = boto3.client('kendra')
    # Query the index for FAQ (question-answer) results only
    response = kendra_client.query(IndexId=kendra_index_id, QueryText=question, QueryResultTypeFilter='QUESTION_ANSWER')
    for item in response['ResultItems']:
        score_confidence = item['ScoreAttributes']['ScoreConfidence']
        # Take answers only from FAQs that have a very high confidence score
        if score_confidence == 'VERY_HIGH' and len(item['AdditionalAttributes']) > 1:
            text = item['AdditionalAttributes'][1]['Value']['TextWithHighlightsValue']['Text']
            url = "None"
            if item['DocumentURI'] != '':
                url = item['DocumentURI']
            return (text, url)
    return (False, False)

3. If there isn't a match with a high enough confidence score, relevant documents from Amazon Kendra with a high confidence score are retrieved through the kendra_retrieve_document method and sent to Amazon Bedrock as the context for generating a response.

import boto3

def kendra_retrieve_document(question, kendra_index_id):
    """
    This function takes in the question from the user and retrieves relevant passages based on the default PageSize of 10.
    :param question: The question the user is asking that was asked via the frontend input text box.
    :param kendra_index_id: The Kendra index containing the documents and FAQs.
    :return: Returns the context to be sent to the LLM and the document URIs to be returned as relevant data sources.
    """
    kendra_client = boto3.client('kendra')
    documents = kendra_client.retrieve(IndexId=kendra_index_id, QueryText=question)
    text = ""
    uris = set()
    # Keep only passages with a HIGH or VERY_HIGH confidence score
    for item in documents['ResultItems']:
        score_confidence = item['ScoreAttributes']['ScoreConfidence']
        if score_confidence in ('VERY_HIGH', 'HIGH'):
            text += item['Content'] + "\n"
            uris.add(item['DocumentURI'])
    return (text, uris)

4. The response is generated from Amazon Bedrock with the invokeLLM method. The following is a snippet of the invokeLLM method within the fulfillment function. Read more on inference parameters and system prompts to modify parameters that are passed into the Amazon Bedrock invoke model request.

import boto3
import json

def invokeLLM(question, context, modelId):
    """
    This function takes in the question from the user, along with the Kendra responses as context, to generate an answer
    for the user on the frontend.
    :param question: The question the user is asking that was asked via the frontend input text box.
    :param context: The response from the Kendra document retrieve query, used as context to generate a better answer.
    :param modelId: The ID of the Amazon Bedrock model used to generate the answer.
    :return: Returns the final answer that will be provided to the end user of the application who asked the original
    question.
    """
    # Set up the Amazon Bedrock runtime client
    bedrock = boto3.client('bedrock-runtime')
    # Body of data with parameters that is passed into the Bedrock invoke model request
    body = json.dumps({
        "max_tokens": 350,
        "system": "You are a truthful AI assistant. Your goal is to provide informative and substantive responses to queries based on the documents provided. If you do not know the answer to a question, you truthfully say you do not know.",
        "messages": [{"role": "user", "content": "Answer this user query: " + question + " with the following context: " + context}],
        "anthropic_version": "bedrock-2023-05-31",
        "temperature": 0,
        "top_k": 250,
        "top_p": 0.999
    })
    # Invoke the Bedrock model with your specifications
    response = bedrock.invoke_model(body=body, modelId=modelId)
    # The body of the response that was generated
    response_body = json.loads(response.get('body').read())
    # The messages API returns a list of content blocks; the answer text is in the first block
    answer = response_body.get('content')[0].get('text')
    # Return the answer as the final result, which ultimately gets returned to the end user
    return answer

5. Finally, the response generated from Amazon Bedrock, along with the relevant referenced URLs, is returned to the end user.
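
To see how the preceding methods fit together, the following is a minimal sketch of a Lex V2 fulfillment handler that tries the FAQs first and falls back to retrieval plus generation. It is an illustration rather than the exact Lambda function from the solution: the KENDRA_INDEX_ID environment variable, the model choice, and the way source URLs are appended to the reply are all assumptions, while the return value follows the Lex V2 Lambda response format.

import os

# Hypothetical wiring of the methods above into a Lex V2 fulfillment handler
KENDRA_INDEX_ID = os.environ['KENDRA_INDEX_ID']  # assumed environment variable
MODEL_ID = 'anthropic.claude-3-sonnet-20240229-v1:0'  # one of the supported models

def lambda_handler(event, context):
    question = event['inputTranscript']  # the raw user utterance relayed from Slack through Amazon Lex
    # Try the curated FAQs first
    answer, url = search_Kendra_FAQ(question, KENDRA_INDEX_ID)
    if not answer:
        # Fall back to RAG: retrieve passages, then generate with Amazon Bedrock
        context_text, uris = kendra_retrieve_document(question, KENDRA_INDEX_ID)
        if context_text:
            answer = invokeLLM(question, context_text, MODEL_ID)
            url = ", ".join(uris)
        else:
            answer, url = "No relevant documents found", None
    message = answer if url in (None, "None") else answer + "\nSource: " + url
    # Close the conversation turn with the answer (Lex V2 response format)
    return {
        "sessionState": {
            "dialogAction": {"type": "Close"},
            "intent": {"name": event['sessionState']['intent']['name'], "state": "Fulfilled"},
        },
        "messages": [{"contentType": "PlainText", "content": message}],
    }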

When selecting websites to index, adhere to the AWS Acceptable Use Policy and other AWS terms. Remember that you can only use Amazon Kendra Web Crawler to index your own web pages or web pages that you have authorization to index. Visit the Amazon Kendra Web Crawler data source guide to learn more about using the web crawler as a data source. Using Amazon Kendra Web Crawler to aggressively crawl websites or web pages you don't own is not considered acceptable use.

Supported features

The chat assistant supports the following features:

- Support for the following Anthropic models on Amazon Bedrock:
  - claude-v2
  - claude-3-haiku-20240307-v1:0
  - claude-instant-v1
  - claude-3-sonnet-20240229-v1:0
- Support for FAQs and the Amazon Kendra Web Crawler data source
- Returns FAQ answers only if the confidence score is VERY_HIGH
- Retrieves only documents from Amazon Kendra that have a HIGH or VERY_HIGH confidence score
- If documents with a high confidence score aren't found, the chat assistant returns "No relevant documents found"
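
Note that the model names above are shorthand; the modelId passed to the Amazon Bedrock invoke model request is the fully qualified identifier. Assuming the standard Bedrock naming for Anthropic models, the mapping looks like the following:

# Fully qualified Bedrock model IDs for the supported models
# (assumes the standard "anthropic." prefix used by Amazon Bedrock)
SUPPORTED_MODELS = {
    "claude-v2": "anthropic.claude-v2",
    "claude-instant-v1": "anthropic.claude-instant-v1",
    "claude-3-haiku-20240307-v1:0": "anthropic.claude-3-haiku-20240307-v1:0",
    "claude-3-sonnet-20240229-v1:0": "anthropic.claude-3-sonnet-20240229-v1:0",
}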

Prerequisites

To implement this solution, you need the following prerequisites:

- Basic knowledge of AWS
- An AWS account with access to Amazon S3 and Amazon Kendra
- An S3 bucket to store your documents. For more information, see Step 1: Create your first S3 bucket and the Amazon S3 User Guide.
- A Slack workspace to integrate the chat assistant
- Permission to install Slack apps in your Slack workspace
- Seed URLs for the Amazon Kendra Web Crawler data source. You'll need authorization to crawl and index any websites provided.
- AWS CloudFormation for deploying the solution resources

Build a generative AI Slack chat assistant

To build the Slack application, use the following steps:

1. Request model access on Amazon Bedrock for all Anthropic models.
2. Create an S3 bucket in the us-east-1 (N. Virginia) AWS Region.
3. Upload the AIBot-LexJson.zip and SampleFAQ.csv files to the S3 bucket.
4. Launch the CloudFormation stack in the us-east-1 (N. Virginia) AWS Region (a scripted boto3 alternative is sketched after these steps).
5. Enter a Stack name of your choice.
6. For S3BucketName, enter the name of the S3 bucket created in Step 2.
7. For S3KendraFAQKey, enter the name of the SampleFAQ.csv file uploaded to the S3 bucket in Step 3.
8. For S3LexBotKey, enter the name of the Amazon Lex .zip file uploaded to the S3 bucket in Step 3.
9. For SeedUrls, enter up to 10 URLs for the web crawler as a comma-delimited list. In the example in this post, we give the publicly available Amazon Bedrock service page as the seed URL.
10. Leave the rest as defaults and choose Next.
11. Choose Next again on the Configure stack options page.
12. Acknowledge by selecting the box and choose Submit, as shown in the following screenshot.
13. Wait for the stack creation to complete.
14. Verify that all resources are created.
15. Test on the AWS Management Console for Amazon Lex:
    a. On the Amazon Lex console, choose your chat assistant ${YourStackName}-AIBot.
    b. Choose Intents.
    c. Choose Version 1 and choose Test, as shown in the following screenshot.
    d. Select the AIBotProdAlias and choose Confirm, as shown in the following screenshot. If you want to make changes to the chat assistant, you can use the draft version, publish a new version, and assign the new version to the AIBotProdAlias. Learn more about Versioning and Aliases.
    e. Test the chat assistant with questions such as "Which AWS service has 11 nines of durability?" and "What is the AWS Well-Architected Framework?" and verify the responses. The following table shows the three FAQs in the sample .csv file.
| _question | _answer | _source_uri |
| --- | --- | --- |
| Which AWS service has 11 nines of durability? | Amazon S3 | https://aws.amazon.com/s3/ |
| What is the AWS Well-Architected Framework? | The AWS Well-Architected Framework enables customers and partners to review their architectures using a consistent approach and provides guidance to improve designs over time. | https://aws.amazon.com/architecture/well-architected/ |
| In what Regions is Amazon Kendra available? | Amazon Kendra is currently available in the following AWS Regions: Northern Virginia, Oregon, and Ireland | https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/ |
    f. The following screenshot shows the question "Which AWS service has 11 nines of durability?" and its response. You can observe that the response is the same as in the FAQ file and includes a link.
    g. Based on the pages you have crawled, ask a question in the chat. For this example, the publicly available Amazon Bedrock page was crawled and indexed. The following screenshot shows the question "What are agents in Amazon Bedrock?" and a generated response that includes relevant links.
16. For integration of the Amazon Lex chat assistant with Slack, see Integrating an Amazon Lex V2 bot with Slack. Choose the AIBotProdAlias under Alias in the Channel Integrations section.
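
If you prefer to script steps 4 through 13 rather than use the console, a boto3 equivalent might look like the following sketch. The stack name and the template location are hypothetical placeholders, and the Capabilities value corresponds to the acknowledgment box in the console (what is actually required depends on the IAM resources the template creates).

import boto3

cfn = boto3.client('cloudformation', region_name='us-east-1')

# Launch the stack with the same parameters described in the steps above
cfn.create_stack(
    StackName='ai-slack-bot',  # hypothetical stack name
    TemplateURL='https://<your-s3-bucket>.s3.amazonaws.com/AIBot-template.yaml',  # hypothetical template location
    Parameters=[
        {'ParameterKey': 'S3BucketName', 'ParameterValue': '<your-s3-bucket>'},
        {'ParameterKey': 'S3KendraFAQKey', 'ParameterValue': 'SampleFAQ.csv'},
        {'ParameterKey': 'S3LexBotKey', 'ParameterValue': 'AIBot-LexJson.zip'},
        {'ParameterKey': 'SeedUrls', 'ParameterValue': 'https://aws.amazon.com/bedrock/'},
    ],
    Capabilities=['CAPABILITY_IAM', 'CAPABILITY_NAMED_IAM'],
)

# Block until stack creation completes (step 13)
cfn.get_waiter('stack_create_complete').wait(StackName='ai-slack-bot')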

Run sample queries to test the solution

1. In Slack, go to the Apps section. In the dropdown menu, choose Manage and select Browse apps.
2. Search for ${AIBot} in the App Directory and choose the chat assistant. This adds the chat assistant to the Apps section in Slack. You can now start asking questions in the chat. The following screenshot shows the question "Which AWS service has 11 nines of durability?" and its response. You can observe that the response is the same as in the FAQ file and includes a link.
3. The following screenshot shows the question "What is the AWS Well-Architected Framework?" and its response.
4. Based on the pages you have crawled, ask a question in the chat. For this example, the publicly available Amazon Bedrock page was crawled and indexed. The following screenshot shows the question "What are agents in Amazon Bedrock?" and a generated response that includes relevant links.
5. The following screenshot shows the question "What is amazon polly?" Because there is no Amazon Polly documentation indexed, the chat assistant responds with "No relevant documents found," as expected.

These examples show how the chat assistant retrieves documents from Amazon Kendra and provides answers based on the documents retrieved. If no relevant documents are found, the chat assistant responds with "No relevant documents found."

Clean up

To clean up the resources created by this solution:

1. Delete the CloudFormation stack by navigating to the CloudFormation console.
2. Select the stack you created for this solution and choose Delete.
3. Confirm the deletion by entering the stack name in the provided field. This will remove all the resources created by the CloudFormation template, including the Amazon Kendra index, Amazon Lex chat assistant, Lambda function, and other related resources.
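
As with deployment, teardown can be scripted. A minimal boto3 equivalent, using the same hypothetical stack name as in the deployment sketch, is:

import boto3

cfn = boto3.client('cloudformation', region_name='us-east-1')

# Delete the stack and wait for all of its resources to be removed
cfn.delete_stack(StackName='ai-slack-bot')  # hypothetical stack name
cfn.get_waiter('stack_delete_complete').wait(StackName='ai-slack-bot')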

Conclusion

This post describes the development of a generative AI Slack application powered by Amazon Bedrock and Amazon Kendra. It is designed to be an internal-facing Slack chat assistant that helps answer questions related to the indexed content. The solution architecture includes Amazon Lex for intent identification, a Lambda function for fulfilling the fallback intent, Amazon Kendra for FAQ searches and indexing crawled web pages, and Amazon Bedrock for generating responses. The post walks through the deployment of the solution using a CloudFormation template, provides instructions for running sample queries, and discusses the steps for cleaning up the resources. Overall, this post demonstrates how to use various AWS services to build a powerful generative AI–powered chat assistant application.

This solution demonstrates the power of generative AI in building intelligent chat and search assistants. Explore the generative AI Slack chat assistant: invite your teams to a Slack workspace and start getting answers based on your indexed content and FAQs. Experiment with different use cases and see how you can harness the capabilities of services like Amazon Bedrock and Amazon Kendra to enhance your business operations. For more information about using Amazon Bedrock with Slack, refer to Deploy a Slack gateway for Amazon Bedrock.


About the authors

Kruthi Jayasimha Rao is a Partner Solutions Architect with a focus on AI and ML. She provides technical guidance to AWS Partners in following best practices to build secure, resilient, and highly available solutions in the AWS Cloud.

Mohamed Mohamud is a Partner Solutions Architect with a focus on Data Analytics. He specializes in streaming analytics, helping partners build real-time data pipelines and analytics solutions on AWS. With expertise in services like Amazon Kinesis, Amazon MSK, and Amazon EMR, Mohamed enables data-driven decision-making through streaming analytics.
