Create an end-to-end serverless digital assistant for semantic search with Amazon Bedrock

With the rise of generative artificial intelligence (AI), an increasing number of organizations use digital assistants to have their end-users ask domain-specific questions, using Retrieval Augmented Generation (RAG) over their enterprise data sources.

As organizations transition from proofs of concept to production workloads, they establish objectives to run and scale their workloads with minimal operational overhead, while optimizing costs. Organizations also require the implementation of common security practices such as identity and access management, to make sure that only authorized and authenticated users are allowed to perform specific actions or access specific resources.

This post covers a solution to create an end-to-end digital assistant as a web application using a serverless architecture to address these requirements. Because the solution components primarily use serverless technologies, it provides several benefits, such as automatic scaling, built-in high availability, and a pay-per-use billing model to optimize costs. The solution also includes an authentication layer and an authorization layer to manage identities and permissions.

This solution also uses the hybrid search feature of Knowledge Bases for Amazon Bedrock to increase the relevancy of retrieved results using RAG. When receiving a query from an end-user, hybrid search performs both a semantic search and a keyword search:

A semantic search provides results based on the meaning and intent within the query.

A keyword search provides results based on specific entities in a query, such as product codes or acronyms.

For example, if a user submits a prompt that includes keywords, a text-based search may provide better results than a semantic search. This is why hybrid search combines the two approaches: the precision of semantic search and coverage of keywords. For more information about hybrid search, see Knowledge Bases for Amazon Bedrock now supports hybrid search.
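To make the idea of combining two result lists concrete, here is a minimal sketch using reciprocal rank fusion (RRF). This is not how Knowledge Bases for Amazon Bedrock merges results internally, which this post does not describe; RRF is swapped in only as a common, simple fusion technique. The document IDs and the query are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document earns 1 / (k + rank) for every list it appears in, so a
    document ranked highly by either search method rises to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for a query that mentions a specific code such as "OPS 8".
semantic_hits = ["doc-operations-health", "doc-monitoring", "doc-ops-8"]
keyword_hits = ["doc-ops-8", "doc-operations-health", "doc-glossary"]

fused = reciprocal_rank_fusion([semantic_hits, keyword_hits])
print(fused)
```

In this toy run, documents found by both searches end up ahead of documents found by only one of them, which is the behavior hybrid search aims for.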

In this post, we provide an operational overview of the solution, and then describe how to set it up with the following services:

Amazon Bedrock

knowledge base

Amazon Bedrock FAQs

Amazon OpenSearch Serverless

Amazon Simple Storage Service

Solution overview

The solution architecture involves the following steps:

Amazon Bedrock API

vector engine for OpenSearch Serverless

After Step 9, the foundation model generates a response that is returned to the user in the web application’s digital assistant.
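As a rough sketch of the backend call behind this workflow, the following builds a request for the RetrieveAndGenerate operation of the Amazon Bedrock agent runtime with hybrid search requested explicitly. It is an illustration under assumptions, not the code from the solution's repository: the function names are hypothetical, and the knowledge base ID and model ARN are placeholders you would supply.

```python
import json


def build_retrieve_and_generate_request(prompt, knowledge_base_id, model_arn,
                                        number_of_results=5):
    """Build keyword arguments for bedrock-agent-runtime retrieve_and_generate."""
    return {
        "input": {"text": prompt},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": knowledge_base_id,
                "modelArn": model_arn,
                "retrievalConfiguration": {
                    "vectorSearchConfiguration": {
                        "numberOfResults": number_of_results,
                        # Ask for semantic and keyword search together.
                        "overrideSearchType": "HYBRID",
                    }
                },
            },
        },
    }


def ask_knowledge_base(prompt, knowledge_base_id, model_arn):
    """Send the prompt to the knowledge base; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the request builder works without the SDK

    client = boto3.client("bedrock-agent-runtime")
    response = client.retrieve_and_generate(
        **build_retrieve_and_generate_request(prompt, knowledge_base_id, model_arn)
    )
    # The generated answer plus the citations that point back to source chunks.
    return response["output"]["text"], response.get("citations", [])


# Placeholders in the style used elsewhere in this post.
request = build_retrieve_and_generate_request(
    "What is the AWS Well-Architected Framework?",
    "#bedrock-knowledgebase-id#",
    "#model-arn#",
)
print(json.dumps(request, indent=2))
```

Keeping the request builder separate from the network call makes it easy to inspect or unit test the payload without AWS access.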

The following diagram illustrates this workflow.

Prerequisites

To follow along and set up this solution, you must have the following:

AWS Amplify CLI set up

Model access

Titan Embeddings G1 – Text

Claude Instant

Upload documents and create a knowledge base

In this section, we create a knowledge base in Amazon Bedrock. The knowledge base will enrich the prompt submitted to an Amazon Bedrock foundation model with contextual information derived from our data source (in our case, documents uploaded to an S3 bucket).

During the creation of the knowledge base, a vector store will also be created to ingest documents encoded as vectors, using an embeddings model. An embeddings model encodes data as vectors in order to capture the meaning and context of our sample documents. This allows us to find data relevant to our end-user prompts.

For our use case, we use the vector engine for OpenSearch Serverless as the vector store and the Titan Embeddings G1 – Text model as the embeddings model.
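The following toy example illustrates how vectors let us find relevant data: the query and each document are represented as vectors, and the closest document by cosine similarity is selected. The three-dimensional vectors are made up for illustration; real embeddings have far more dimensions and come from the embeddings model, and the vector store performs this kind of comparison for you.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings for two of the sample documents.
documents = {
    "well-architected": [0.9, 0.1, 0.2],
    "microservices": [0.1, 0.8, 0.3],
}

# Hypothetical embedding of an end-user prompt.
query_vector = [0.8, 0.2, 0.1]

best = max(documents, key=lambda name: cosine_similarity(query_vector, documents[name]))
print(best)
```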

Complete the following steps to create an S3 bucket to upload documents, and synchronize them with a knowledge base in Amazon Bedrock:

Create an S3 bucket

Upload

Overview of Amazon Web Services

AWS Well-Architected Framework

Implementing Microservices on AWS

Create a knowledge base

Knowledge base name

assistant-knowledgebase

Knowledge base description

Knowledge base for digital assistant

IAM permissions

Create and use a new service role

Data source name

assistant-knowledgebase-datasource

S3 URI

s3://#s3-bucket-name#

Embeddings model

Titan Embeddings G1 – Text

Vector database

Quick create a new vector store

Ingest

Create the API and backend

In this section, we create the following resources:

Amazon Cognito user pool

Complete the following steps to create the API and the backend of the digital assistant’s web application, using AWS CloudFormation templates:

GitHub repository

api

webapp-userpool-stack.yml

webapp-lambda-stack.yml

webapp-api-stack.yml

lambda-auth

lambda-knowledgebase

cognito-create-testuser.sh

AWS Command Line Interface

aws cloudformation create-stack --stack-name webapp-userpool-stack --template-body file://webapp-userpool-stack.yml

lambda-knowledgebase

pip install -r requirements.txt -t .

lambda-knowledgebase.zip

api

lambda-auth

pip install -r requirements.txt -t .

lambda-auth.zip

Create an S3 bucket

Upload

lambda-auth.zip

lambda-knowledgebase.zip

api

aws cloudformation create-stack \
  --stack-name webapp-lambda-knowledgebase-stack \
  --capabilities "CAPABILITY_IAM" \
  --template-body file://webapp-lambda-knowledgebase-stack.yml \
  --parameters ParameterKey=BedrockKnowledgeBaseId,ParameterValue=#bedrock-knowledgebase-id# \
    ParameterKey=BedrockLambdaS3Bucket,ParameterValue=#lambdacode-s3-bucket-name# \
    ParameterKey=BedrockLambdaS3Key,ParameterValue=lambda-knowledgebase.zip

You can retrieve the knowledge base ID by running the following AWS CLI command:

aws bedrock-agent list-knowledge-bases \
  --output text \
  --query 'knowledgeBaseSummaries[?name==`assistant-knowledgebase`].knowledgeBaseId'
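If you prefer to look up the ID from Python, the same filter can be sketched with boto3 as follows. The helper names are hypothetical, and the lookup assumes the knowledge base was named assistant-knowledgebase as described earlier.

```python
def find_knowledge_base_ids(summaries, name):
    """Return the IDs of all knowledge base summaries with the given name."""
    return [s["knowledgeBaseId"] for s in summaries if s.get("name") == name]


def list_all_knowledge_bases():
    """Fetch every knowledge base summary; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the filter above works without the SDK

    client = boto3.client("bedrock-agent")
    summaries = []
    kwargs = {}
    while True:
        page = client.list_knowledge_bases(**kwargs)
        summaries.extend(page["knowledgeBaseSummaries"])
        token = page.get("nextToken")
        if not token:
            return summaries
        # Continue from where the previous page stopped.
        kwargs["nextToken"] = token


# Offline example with made-up summaries shaped like the API response.
sample = [
    {"knowledgeBaseId": "EXAMPLEID1", "name": "assistant-knowledgebase"},
    {"knowledgeBaseId": "EXAMPLEID2", "name": "another-knowledgebase"},
]
print(find_knowledge_base_ids(sample, "assistant-knowledgebase"))
```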

Create the API of the web application using the following AWS CLI command (provide your bucket name):

aws cloudformation create-stack \
  --stack-name webapp-api-stack \
  --capabilities "CAPABILITY_IAM" \
  --template-body file://webapp-api-stack.yml \
  --parameters ParameterKey=LambdaAuthorizerS3Bucket,ParameterValue=#lambdacode-s3-bucket-name# \
    ParameterKey=LambdaAuthorizerS3Key,ParameterValue=lambda-auth.zip
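The API stack takes a Lambda authorizer package (lambda-auth.zip) as a parameter. To show what such a function returns to API Gateway, here is a minimal sketch of a token-based authorizer. It is not the code shipped in lambda-auth: verify_token is a hypothetical placeholder that rejects every token, and a real implementation would validate the token issued by the Amazon Cognito user pool.

```python
def build_authorizer_response(principal_id, effect, method_arn):
    """Build the response document API Gateway expects from a Lambda authorizer."""
    if effect not in ("Allow", "Deny"):
        raise ValueError("effect must be 'Allow' or 'Deny'")
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Action": "execute-api:Invoke",
                    "Effect": effect,
                    "Resource": method_arn,
                }
            ],
        },
    }


def verify_token(token):
    """Placeholder (assumption): always rejects and returns None.

    Replace with real validation of the Cognito-issued token that returns
    the user identifier on success.
    """
    return None


def lambda_handler(event, context):
    """Deny by default; allow only when the token is verified."""
    user_id = verify_token(event.get("authorizationToken", ""))
    effect = "Allow" if user_id else "Deny"
    return build_authorizer_response(user_id or "anonymous", effect, event["methodArn"])
```

The handler fails closed: until verify_token is replaced with real validation, every request is denied.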

Configure the Amazon Cognito user pool

In this section, we create a user in our Amazon Cognito user pool. This user will be used to log in to our web application.

Complete the following steps to configure the Amazon Cognito user pool created in the previous section:

webapp-userpool

Users

Create a user

Invitation message

Send an email invitation

Email address

Mark email address as verified

Temporary password

Generate a password

Create user

You can also complete these steps by running the script cognito-create-testuser.sh available in the api folder as follows (provide your email address):

./cognito-create-testuser.sh #your-email-address#

After you create the user, you should receive an email with a temporary password in this format: “Your username is #your-email-address# and temporary password is #temporary-password#.”

Keep note of these login details (email address and temporary password) to use later when testing the web application.
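For reference, the console steps above map roughly to a single AdminCreateUser call. The following is a sketch with boto3, not the contents of cognito-create-testuser.sh (which may work differently); the function names are hypothetical, and it assumes the email address is used as the username, as the invitation email suggests.

```python
def build_admin_create_user_request(user_pool_id, email):
    """Build keyword arguments for the cognito-idp admin_create_user call."""
    return {
        "UserPoolId": user_pool_id,
        "Username": email,
        "UserAttributes": [
            {"Name": "email", "Value": email},
            # Corresponds to "Mark email address as verified" in the console.
            {"Name": "email_verified", "Value": "true"},
        ],
        # Corresponds to "Send an email invitation". Because no
        # TemporaryPassword is passed, Cognito generates one.
        "DesiredDeliveryMediums": ["EMAIL"],
    }


def create_test_user(user_pool_id, email):
    """Create the user; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the request builder works without the SDK

    client = boto3.client("cognito-idp")
    return client.admin_create_user(**build_admin_create_user_request(user_pool_id, email))
```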

Create the web application

In this section, we build a web application using Amplify and publish it to make it accessible through an endpoint URL. To complete this section, you must first install and set up the Amplify CLI, as discussed in the prerequisites.

Complete the following steps to create the web application of the digital assistant:

frontend

amplify-setup.sh

./amplify-setup.sh

The amplify-setup.sh script creates an Amplify application and configures it to integrate with resources you created in the previous modules:

The Amazon Cognito user pool to authenticate our user through the web application’s login page

The Amazon API Gateway to process prompts submitted using the web application’s chat interface

amplify add hosting

Select the plugin module to execute

Hosting with Amplify Console (Managed hosting with custom domains, Continuous deployment)

Choose a type

Manual deployment

In this step, we configure how the web application will be deployed and hosted:

The web application will be hosted using the Amplify console, which offers fully managed hosting

The web application will be deployed using manual deployment, which allows us to publish our web application to the Amplify console without connecting a Git provider

amplify publish --yes

The web application is now available for testing and a URL should be displayed, as shown in the following screenshot. Take note of the URL to use in the following section.

Test the digital assistant

In this section, you test the web application of the digital assistant:

Sign in

Change Password

What is the OPS number related to health of operations in the Well Architected framework?

You should receive a response along with sources, as shown in the following screenshot.

Clean up

To make sure that no additional cost is incurred, remove the resources provisioned in your account. Make sure you’re in the correct AWS account before deleting the following resources.

Delete the knowledge base

aws cloudformation delete-stack --stack-name webapp-api-stack --region #region#
aws cloudformation delete-stack --stack-name webapp-lambda-knowledgebase-stack --region #region#
aws cloudformation delete-stack --stack-name webapp-userpool-stack --region #region#

aws amplify delete-app --app-id #app-id# --region #region#

aws amplify list-apps --query 'apps[?name==`frontend`].appId'

Delete the S3 buckets

You should exercise caution when performing the preceding steps. Make sure you are deleting the resources in the correct AWS account.
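The delete-stack command returns before deletion finishes. If you want to script the stack cleanup and wait for each stack to be removed before deleting the next, a sketch with boto3 could look like the following. The function name is hypothetical, and the deletion order simply mirrors the order of the commands above; whether the stacks actually depend on each other is an assumption.

```python
STACKS_IN_DELETION_ORDER = [
    "webapp-api-stack",
    "webapp-lambda-knowledgebase-stack",
    "webapp-userpool-stack",
]


def delete_stacks(cloudformation, stack_names):
    """Delete stacks one at a time, waiting for each deletion to complete.

    `cloudformation` is a boto3 CloudFormation client, for example
    boto3.client("cloudformation", region_name="#region#").
    """
    waiter = cloudformation.get_waiter("stack_delete_complete")
    deleted = []
    for name in stack_names:
        cloudformation.delete_stack(StackName=name)
        # Blocks until the stack is gone; raises if the deletion fails.
        waiter.wait(StackName=name)
        deleted.append(name)
    return deleted
```

Passing the client in as an argument keeps the function easy to try against a stub before running it in a real account.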

Conclusion

In this post, we walked through a solution to create a digital assistant using serverless services. First, we created a knowledge base and ingested documents into it from an S3 bucket. Then we created an API and a Lambda function to submit prompts to the knowledge base. We also configured a user pool to grant a user access to the digital assistant’s web application. Finally, we created the frontend of the web application in Amplify.

For further information on the services used, consult the Amazon Bedrock, Security in Amazon Bedrock, Amazon OpenSearch Serverless, AWS Amplify, Amazon API Gateway, AWS Lambda, Amazon Cognito, and Amazon S3 product pages.

To dive deeper into this solution, a self-paced workshop is available in AWS Workshop Studio, at this location.

About the author

Mehdi Amrane is a Senior Solutions Architect at Amazon Web Services. He supports customers on their initiatives and provides them with prescriptive guidance to achieve their goals and accelerate their cloud journey. He is passionate about creating content on application architecture, DevOps, and serverless technologies.