AWS Machine Learning Blog, July 3, 2024
Build a conversational chatbot using different LLMs within single interface – Part 1

 

This post presents a solution for building a versatile Q&A chatbot with Amazon Bedrock. The solution lets users choose between different foundation models and inference parameters, and supports multiple input data formats, including text, website links, YouTube videos, audio, scanned images, and PowerPoint files. It uses Retrieval Augmented Generation (RAG) to retrieve relevant information from a variety of data sources and enhance the model's responses, and it provides an intuitive interface for interacting with the chatbot.


With the advent of generative artificial intelligence (AI), foundation models (FMs) can generate content such as answers to questions, text summaries, and highlights from a source document. However, model selection spans a wide choice of providers, such as Amazon, Anthropic, AI21 Labs, Cohere, and Meta, and real-world data arrives in many distinct formats: PDF, Word, text, CSV, image, audio, and video.

Amazon Bedrock is a fully managed service that makes it straightforward to build and scale generative AI applications. Amazon Bedrock offers a choice of high-performing FMs from leading AI companies, including AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon, through a single API. It enables you to privately customize FMs with your data using techniques such as fine-tuning, prompt engineering, and Retrieval Augmented Generation (RAG), and build agents that run tasks using your enterprise systems and data sources while complying with security and privacy requirements.

In this post, we show you a solution for building a single interface conversational chatbot that allows end-users to choose between different large language models (LLMs) and inference parameters for varied input data formats. The solution uses Amazon Bedrock to create choice and flexibility to improve the user experience and compare the model outputs from different options.
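
To illustrate how a single interface can target multiple model providers on Amazon Bedrock, the following sketch builds provider-specific request bodies for the Bedrock InvokeModel API. The exact body schemas vary by provider and model version, so treat the field names here as assumptions to verify against the Amazon Bedrock documentation; the build_request_body helper is our own illustrative function, not part of the solution's code base.

```python
import json

def build_request_body(model_id: str, prompt: str,
                       temperature: float = 0.5, max_tokens: int = 512) -> str:
    """Build a JSON request body for Amazon Bedrock InvokeModel.

    Field names follow the commonly documented schemas for each provider
    family, but should be checked against the current Bedrock docs.
    """
    if model_id.startswith("anthropic."):
        body = {
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": max_tokens,
            "temperature": temperature,
            "messages": [{"role": "user", "content": prompt}],
        }
    elif model_id.startswith("amazon.titan"):
        body = {
            "inputText": prompt,
            "textGenerationConfig": {
                "temperature": temperature,
                "maxTokenCount": max_tokens,
            },
        }
    else:
        # Generic fallback; real provider schemas differ per model family.
        body = {"prompt": prompt, "temperature": temperature,
                "max_tokens": max_tokens}
    return json.dumps(body)

# The actual invocation would look roughly like this (requires AWS
# credentials and boto3, so it is shown here only as a comment):
#   import boto3
#   client = boto3.client("bedrock-runtime")
#   response = client.invoke_model(
#       modelId="anthropic.claude-3-sonnet-20240229-v1:0",
#       body=build_request_body("anthropic.claude-3-sonnet-20240229-v1:0", "Hello"))

print(build_request_body("anthropic.claude-3-sonnet-20240229-v1:0", "Hello"))
```

Because each provider expects a different body shape, routing on the model ID prefix like this is one simple way a single UI can expose many models and inference parameters behind one code path.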

The entire code base is available in GitHub, along with an AWS CloudFormation template.

What is RAG?

Retrieval Augmented Generation (RAG) can enhance the generation process by using the benefits of retrieval, enabling a natural language generation model to produce more informed and contextually appropriate responses. By incorporating relevant information from retrieval into the generation process, RAG aims to improve the accuracy, coherence, and informativeness of the generated content.
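
To make the retrieval step concrete, here is a minimal, self-contained sketch of the RAG pattern: documents and the query are embedded as vectors, the closest documents are retrieved by similarity, and the retrieved context is prepended to the prompt. The bag-of-words "embedding" is a toy stand-in for a real embedding model such as Amazon Titan Embeddings, and the function names are our own illustrations.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; a real system would call an
    # embedding model such as Amazon Titan Embeddings instead.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank all documents by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def augment_prompt(query: str, docs: list[str]) -> str:
    # Fold the retrieved context into the prompt sent to the FM.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Amazon Bedrock offers foundation models through a single API.",
    "Streamlit is a framework for building data apps in Python.",
    "RAG retrieves relevant context before generation.",
]
print(augment_prompt("Which service offers foundation models?", docs))
```

In the actual solution, the vector store (for example, Amazon OpenSearch Service) replaces the in-memory ranking, and the augmented prompt is sent to the selected Bedrock model.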

Implementing an effective RAG system requires several key components working in harmony: a document ingestion and preprocessing pipeline, an embedding model to vectorize content, a vector store for fast similarity search, and an FM to generate responses from the retrieved context.

Amazon Bedrock provides fully managed support for the end-to-end RAG workflow through Knowledge Bases for Amazon Bedrock. With Knowledge Bases for Amazon Bedrock, you can give FMs and agents contextual information from your company's private data sources for RAG to deliver more relevant, accurate, and customized responses.

To equip FMs with up-to-date and proprietary information, organizations use RAG to fetch data from company data sources and enrich the prompt to provide more relevant and accurate responses. Knowledge Bases for Amazon Bedrock is a fully managed capability that helps you implement the entire RAG workflow, from ingestion to retrieval and prompt augmentation, without having to build custom integrations to data sources and manage data flows. Session context management is built in, so your app can readily support multi-turn conversations.
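
Session context management can be pictured with a small sketch: each turn's question and answer are appended to a history that is folded into the next prompt. Knowledge Bases for Amazon Bedrock handles this for you; the ChatSession class below is purely illustrative and not part of the solution's code.

```python
class ChatSession:
    """Minimal multi-turn context manager (illustrative only;
    Knowledge Bases for Amazon Bedrock provides this built in)."""

    def __init__(self):
        self.history: list[tuple[str, str]] = []

    def build_prompt(self, question: str) -> str:
        # Fold prior turns into the prompt so the model can resolve
        # follow-up questions against earlier answers.
        turns = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in self.history)
        prefix = f"Conversation so far:\n{turns}\n\n" if turns else ""
        return f"{prefix}User: {question}\nAssistant:"

    def record(self, question: str, answer: str) -> None:
        self.history.append((question, answer))

session = ChatSession()
print(session.build_prompt("What is Amazon Bedrock?"))
session.record("What is Amazon Bedrock?", "A fully managed generative AI service.")
print(session.build_prompt("Which models does it offer?"))
```

Managing this history yourself quickly becomes tedious (truncation, summarization, per-user isolation), which is the practical appeal of having it built into the managed service.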

Solution overview

This chatbot is built using RAG, enabling it to provide versatile conversational abilities. The following figure illustrates a sample UI of the Q&A interface using Streamlit and the workflow.

This post provides a single UI with multiple choices for the following capabilities: the foundation model to use, the inference parameters (such as temperature and maximum length), the RAG operation (Q&A, summarization, or text extraction), and the input data format (text, website links, YouTube videos, audio, scanned images, or PowerPoint files).

We used one of LangChain's many document loaders, YoutubeLoader. Its from_youtube_url function extracts the transcript and metadata from a YouTube video.

The loaded documents contain two attributes: page_content and metadata.

The text extracted from the transcript is loaded with LangChain's TextLoader, then the document is split into chunks, embeddings are created for each chunk, and the embeddings are stored in the vector store.
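
A minimal sketch of that split-and-embed step, assuming a fixed-size character splitter with overlap (LangChain's character-based text splitters work in a similar spirit; chunk_text and the toy embed_chunk function are our own illustrations, not the solution's actual code):

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into fixed-size character chunks with overlap, so that
    context spanning a chunk boundary is not lost entirely."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

def embed_chunk(chunk: str) -> list[float]:
    # Toy embedding: in the real solution this would be a call to
    # Amazon Titan Embeddings G1 - Text v1.2 via Amazon Bedrock.
    return [float(len(chunk)), float(chunk.count(" "))]

transcript = "word " * 60  # stand-in for a YouTube transcript
vector_store = [(c, embed_chunk(c)) for c in chunk_text(transcript)]
print(len(vector_store), "chunks embedded")
```

The overlap means the tail of each chunk is repeated at the head of the next, which keeps sentences that straddle a boundary retrievable from at least one chunk.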

The following diagram illustrates the solution architecture.

Prerequisites

To implement this solution, you should have the following prerequisites:

Deploy the solution

The CloudFormation template deploys an Amazon Elastic Compute Cloud (Amazon EC2) instance to host the Streamlit application, along with other associated resources like an AWS Identity and Access Management (IAM) role and Amazon Simple Storage Service (Amazon S3) bucket. For more information about Amazon Bedrock and IAM, refer to How Amazon Bedrock Works with IAM.

In this post, we deploy the Streamlit application over an EC2 instance inside a VPC, but you can deploy it as a containerized application using a serverless solution with AWS Fargate. We discuss this in more detail in Part 2.

Complete the following steps to deploy the solution resources using AWS CloudFormation:

    Download the CloudFormation template StreamlitAppServer_Cfn.yml from the GitHub repo.
    On the AWS CloudFormation console, create a new stack.
    For Prepare template, select Template is ready.
    In the Specify template section, for Template source, select Upload a template file, then choose the template file you downloaded.
    Choose Next.

    For Stack name, enter a name (for this post, StreamlitAppServer).
    In the Parameters section, provide the following information:
      For Specify the VPC ID where you want your app server deployed, enter the ID of the VPC where you want to deploy the application server.
      For VPCCidr, enter the CIDR of the VPC you're using.
      For SubnetID, enter a subnet ID from the same VPC.
      For MYIPCidr, enter the IP address of your computer or workstation so you can open the Streamlit application in your local browser.

You can run the command curl https://api.ipify.org on your local terminal to get your IP address.

    Leave the rest of the parameters at their default values and choose Next.
    In the Capabilities section, select the acknowledgement check box.
    Choose Submit.

Wait until you see the stack status show as CREATE_COMPLETE.

    Choose the stack’s Resources tab to see the resources you launched as part of the stack deployment.

    Choose the link for S3Bucket to be redirected to the Amazon S3 console.
      Note the S3 bucket name to update the deployment script later.
      Choose Create folder to create a new folder.
      For Folder name, enter a name (for this post, gen-ai-qa).

Make sure to follow AWS security best practices for securing data in Amazon S3. For more details, see Top 10 security best practices for securing data in Amazon S3.

    Return to the stack Resources tab and choose the link to StreamlitAppServer to be redirected to the Amazon EC2 console.
      Select StreamlitApp_Server and choose Connect.

This will open a new page with various ways to connect to the EC2 instance launched.

    For this solution, select Connect using EC2 Instance Connect, then choose Connect.

This will open an Amazon EC2 session in your browser.

    Run the following command to monitor the progress of all the Python-related libraries being installed as part of the user data:
tail -f /tmp/userData.log
    When you see the message Finished running user data..., you can exit the session by pressing Ctrl + C.

This takes about 15 minutes to complete.

    Run the following commands to start the application:
cd $HOME/bedrock-qnachatbot

# Look up the S3 bucket name from the CloudFormation stack outputs
bucket_name=$(aws cloudformation describe-stacks --stack-name StreamlitAppServer --query "Stacks[0].Outputs[?starts_with(OutputKey, 'BucketName')].OutputValue" --output text)

# Query the EC2 instance metadata service (IMDSv2) for the current Region
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
aws_region_name=$(curl -s http://169.254.169.254/latest/meta-data/placement/region -H "X-aws-ec2-metadata-token: $TOKEN")

# Substitute the bucket name and Region into the application configuration
sed -i "s/<S3_Bucket_Name>/${bucket_name}/g" $HOME/bedrock-qnachatbot/src/utils.py
sed -i "s/<AWS_Region>/${aws_region_name}/g" $HOME/bedrock-qnachatbot/src/utils.py

export AWS_DEFAULT_REGION=${aws_region_name}
streamlit run src/1_?_Home.py

    Make a note of the External URL value. If you exit the session (or the application stops), you can restart the application by rerunning the commands above.

Use the chatbot

Use the external URL you copied in the previous step to access the application.

You can upload your file to start using the chatbot for Q&A.

Clean up

To avoid incurring future charges, delete the resources that you created:

    Empty the contents of the S3 bucket you created as part of this post.
    Delete the CloudFormation stack you created as part of this post.

Conclusion

In this post, we showed you how to create a Q&A chatbot that can answer questions across an enterprise’s corpus of documents with choices of FM available within Amazon Bedrock—within a single interface.

In Part 2, we show you how to use Knowledge Bases for Amazon Bedrock with enterprise-grade vector databases like OpenSearch Service, Amazon Aurora PostgreSQL, MongoDB Atlas, Weaviate, and Pinecone with your Q&A chatbot.


About the Authors

Anand Mandilwar is an Enterprise Solutions Architect at AWS. He works with enterprise customers, helping them innovate and transform their businesses on AWS. He is passionate about automation in cloud operations, infrastructure provisioning, and cloud optimization. He also likes Python programming. In his spare time, he enjoys honing his photography skills, especially in portrait and landscape photography.

NagaBharathi Challa is a Solutions Architect on the US federal civilian team at Amazon Web Services (AWS). She works closely with customers to effectively use AWS services for their mission use cases, providing architectural best practices and guidance on a wide range of services. Outside of work, she enjoys spending time with family and spreading the power of meditation.
