AWS Machine Learning Blog, October 12, 2024
Improve LLM application robustness with Amazon Bedrock Guardrails and Amazon Bedrock Agents

This article describes how to use Amazon Bedrock Guardrails to improve the robustness of Amazon Bedrock Agents and prevent agents from giving unsafe or harmful advice in domain-specific use cases. Using an online retail chatbot as an example, it shows how integrating Amazon Bedrock Guardrails with Amazon Bedrock Agents can stop the chatbot from providing financial advice. The article also explores other fine-grained robustness controls, such as restricting the generation of personally identifiable information (PII).

📡 **Robustness of Amazon Bedrock Agents:** Amazon Bedrock Agents use large language models (LLMs) as a reasoning engine to build dynamic, complex workflows for business use cases. These workflows decompose natural language queries into multiple actionable steps and use iterative feedback loops and self-reflection to produce the final result. When such workflows handle tasks that are adversarial or harmful in nature, their robustness needs to be measured and evaluated.

💻 **The role of Amazon Bedrock Guardrails:** Amazon Bedrock Guardrails is a mechanism that helps prevent agents from generating unsafe or harmful content. It provides additional customizable safeguards on top of the built-in protections of foundation models (FMs). By blocking harmful content and filtering hallucinated responses for Retrieval Augmented Generation (RAG) and summarization workloads, Amazon Bedrock Guardrails delivers industry-leading safety protections.

📊 **Example use case:** The article uses an online retail chatbot to show how Amazon Bedrock Guardrails can prevent a chatbot from giving financial advice. The chatbot uses Amazon Bedrock Agents to implement dynamic workflows, such as searching for and purchasing shoes based on customer preferences. With Amazon Bedrock Guardrails, the chatbot can be blocked from responding to queries asking for financial advice.

📢 **Other robustness controls:** Beyond blocking harmful content, Amazon Bedrock Guardrails can also satisfy other fine-grained robustness requirements, such as restricting the generation of PII. By configuring Amazon Bedrock Guardrails, you can achieve improved robustness for these regulatory compliance cases and custom business needs without fine-tuning the LLM.

📱 **Solution architecture:** The article presents a solution architecture for building a robust chatbot with Amazon Bedrock, Amazon Bedrock Agents, and Amazon Bedrock Guardrails. The architecture also includes other AWS services such as AWS Identity and Access Management (IAM), AWS Lambda, and Amazon SageMaker.

📣 **Cost considerations:** The article also discusses the cost considerations of using Amazon Bedrock Agents and Amazon Bedrock Guardrails.

📥 **Prerequisites:** The article lists the prerequisites for running the example in an AWS account, such as creating an AWS account, cloning the GitHub repository, setting up a SageMaker notebook, and getting access to the models hosted on Amazon Bedrock.

Agentic workflows are a new approach to building dynamic and complex business workflows with large language models (LLMs) as their reasoning engine. These agentic workflows decompose natural language query-based tasks into multiple actionable steps, with iterative feedback loops and self-reflection, and produce the final result using tools and APIs. This naturally warrants the need to measure and evaluate the robustness of these workflows, particularly when handling tasks that are adversarial or harmful in nature.

Amazon Bedrock Agents can break down natural language conversations into a sequence of tasks and API calls using ReAct and chain-of-thought (CoT) prompting techniques with LLMs. This offers tremendous use case flexibility, enables dynamic workflows, and reduces development cost. Amazon Bedrock Agents is instrumental in customizing and tailoring apps to help meet specific project requirements while protecting private data and securing your applications. These agents work with AWS managed infrastructure capabilities and Amazon Bedrock, reducing infrastructure management overhead.

Although Amazon Bedrock Agents have built-in mechanisms to help avoid general harmful content, you can incorporate a custom, user-defined fine-grained mechanism with Amazon Bedrock Guardrails. Amazon Bedrock Guardrails provides additional customizable safeguards on top of the built-in protections of foundation models (FMs), delivering safety protections that are among the best in the industry by blocking harmful content and filtering hallucinated responses for Retrieval Augmented Generation (RAG) and summarization workloads. This enables you to customize and apply safety, privacy, and truthfulness protections within a single solution.

In this post, we demonstrate how you can identify and improve the robustness of Amazon Bedrock Agents when integrated with Amazon Bedrock Guardrails for domain-specific use cases.

Solution overview

In this post, we explore a sample use case for an online retail chatbot. The chatbot requires dynamic workflows for use cases like searching for and purchasing shoes based on customer preferences using natural language queries. To implement this, we build an agentic workflow using Amazon Bedrock Agents.

To test its adversarial robustness, we then prompt this bot to give fiduciary advice regarding retirement. We use this example to demonstrate robustness concerns, followed by robustness improvement using the agentic workflow with Amazon Bedrock Guardrails to help prevent the bot from giving fiduciary advice.

In this implementation, the preprocessing stage of the agent (the first stage of the agentic workflow, before the LLM is invoked) is turned off by default. Even with preprocessing turned on, there is usually a need for more fine-grained, use case-specific control over what can be marked as safe and acceptable. In this example, a retail agent for shoes giving away fiduciary advice is definitely out of scope of the product use case; such advice may be detrimental and can cause customers to lose trust, among other safety concerns.

Another typical fine-grained robustness control requirement could be to restrict personally identifiable information (PII) from being generated by these agentic workflows. We can configure and set up Amazon Bedrock Guardrails in Amazon Bedrock Agents to deliver improved robustness against such regulatory compliance cases and custom business needs without the need for fine-tuning LLMs.
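The post doesn't show this PII configuration in detail, but as a minimal sketch, a PII-focused policy can be expressed with the same Boto3 create_guardrail API used later in this post. The guardrail name, entity selections, regex pattern, and blocked messages below are illustrative, not taken from the original notebook:

```python
import boto3

bedrock = boto3.client("bedrock")  # control-plane client that manages guardrails

# Illustrative PII policy: mask common identifiers, block Social Security
# numbers outright, and mask a hypothetical internal customer ID format.
pii_guardrail = bedrock.create_guardrail(
    name="retail-pii-guardrail",  # hypothetical name
    description="Masks or blocks PII in retail chatbot inputs and outputs.",
    sensitiveInformationPolicyConfig={
        "piiEntitiesConfig": [
            {"type": "EMAIL", "action": "ANONYMIZE"},
            {"type": "PHONE", "action": "ANONYMIZE"},
            {"type": "US_SOCIAL_SECURITY_NUMBER", "action": "BLOCK"},
        ],
        "regexesConfig": [
            {
                "name": "customer-id",
                "description": "Hypothetical internal customer ID format",
                "pattern": "CUST-[0-9]{6}",
                "action": "ANONYMIZE",
            }
        ],
    },
    blockedInputMessaging="Sorry, I can't process requests containing that information.",
    blockedOutputsMessaging="Sorry, I can't share that information.",
)
```

ANONYMIZE masks the matched entity in the response, whereas BLOCK rejects the input or output entirely, so the choice per entity type depends on the compliance requirement.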

The following diagram illustrates the solution architecture.

We use the following AWS services: Amazon Bedrock, Amazon Bedrock Agents, Amazon Bedrock Guardrails, AWS Identity and Access Management (IAM), AWS Lambda, and Amazon SageMaker.

In the following sections, we demonstrate how to use the GitHub repository to run this example using three Jupyter notebooks.

Prerequisites

To run this demo in your AWS account, complete the following prerequisites:

1. Create an AWS account if you don’t already have one.
2. Clone the GitHub repository and follow the steps explained in the README.
3. Set up a SageMaker notebook using an AWS CloudFormation template, available in the GitHub repo. The CloudFormation template also provides the required IAM access to set up SageMaker resources and Lambda functions.
4. Acquire access to models hosted on Amazon Bedrock. Choose Manage model access in the navigation pane on the Amazon Bedrock console and choose from the list of available options. We use Anthropic Claude 3 Haiku on Amazon Bedrock and Amazon Titan Embeddings Text v1 on Amazon Bedrock for this post.

Create a guardrail

In the Part 1a notebook, complete the following steps to create a guardrail to help prevent the chatbot from providing fiduciary advice:

1. Create a guardrail with Amazon Bedrock Guardrails using the Boto3 API, with content filters, word and phrase filters, and sensitive word filters, such as for PII and regular expressions (regex), to protect sensitive information from our retail customers.
2. List and create guardrail versions.
3. Update the guardrails.
4. Perform unit testing on the guardrails.
5. Note the guardrail-id and guardrail-arn values to use in Part 1c:

```python
create_response = client.create_guardrail(
    name=guardrail_name,
    description='Prevents our model from providing fiduciary advice.',
    topicPolicyConfig={
        'topicsConfig': [
            {
                'name': 'Fiduciary Advice',
                'definition': 'Providing personalized advice or recommendations on managing financial assets, investments, or trusts in a fiduciary capacity or assuming related obligations and liabilities.',
                'examples': [
                    'What stocks should I invest in for my retirement?',
                    'Is it a good idea to put my money in a mutual fund?',
                    'How should I allocate my 401(k) investments?',
                    'What type of trust fund should I set up for my children?',
                    'Should I hire a financial advisor to manage my investments?'
                ],
                'type': 'DENY'
            }
        ]
    },
    …
)
```
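The post doesn't reproduce the versioning and unit-testing code, but a minimal sketch of those steps with the Boto3 create_guardrail_version and ApplyGuardrail APIs could look like the following. It assumes create_response from the snippet above; the version description is illustrative, and the test prompt reuses one of the denied-topic examples:

```python
import boto3

bedrock = boto3.client("bedrock")
bedrock_runtime = boto3.client("bedrock-runtime")

# Publish an immutable version of the guardrail so agents can reference it.
version_response = bedrock.create_guardrail_version(
    guardrailIdentifier=create_response["guardrailId"],
    description="v1: deny fiduciary advice",  # illustrative description
)

# Unit test the guardrail standalone, without any agent in the loop.
test_response = bedrock_runtime.apply_guardrail(
    guardrailIdentifier=create_response["guardrailId"],
    guardrailVersion=version_response["version"],
    source="INPUT",  # evaluate a user input; use "OUTPUT" to test model responses
    content=[{"text": {"text": "What stocks should I invest in for my retirement?"}}],
)
print(test_response["action"])  # expect "GUARDRAIL_INTERVENED" for a denied topic
```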

Test the use case without guardrails

In the Part 1b notebook, complete the following steps to run the use case with Amazon Bedrock Agents, without Amazon Bedrock Guardrails and with no preprocessing, to demonstrate the adversarial robustness problem:

1. Choose the underlying FM for your agent.
2. Provide a clear and concise agent instruction.
3. Create and associate an action group with an API schema and Lambda function (a sketch of this step follows the list).
4. Create, invoke, test, and deploy the agent.
5. Demonstrate a chat session with multi-turn conversations.
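The action group code isn't reproduced in the post. As a minimal sketch, assuming agent_id, lambda_function_arn, schema_bucket, and schema_key come from the earlier setup steps, step 3 could look like the following with the Boto3 bedrock-agent client:

```python
import boto3

bedrock_agent_client = boto3.client("bedrock-agent")

# Attach an action group that connects the agent to its business logic: an
# OpenAPI schema in S3 describing the shoe-store operations, and the Lambda
# function that implements them.
bedrock_agent_client.create_agent_action_group(
    agentId=agent_id,                    # from the create_agent response
    agentVersion="DRAFT",
    actionGroupName="ShoeStoreActions",  # hypothetical name
    actionGroupExecutor={"lambda": lambda_function_arn},
    apiSchema={"s3": {"s3BucketName": schema_bucket, "s3ObjectKey": schema_key}},
)

# Prepare the DRAFT version so the agent can be invoked for testing.
bedrock_agent_client.prepare_agent(agentId=agent_id)
```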

The agent instruction is as follows:

“You are an agent that helps customers purchase shoes. If the customer does not provide their name in the first input, ask for their name before invoking any functions. Retrieve customer details like customer ID and preferred activity based on the name. Then check inventory for shoe best fit activity matching customer preferred activity. Generate response with shoe ID, style description and colors based on shoe inventory details. If multiple matches exist, display all of them to the user. After customer indicates they would like to order the shoe, use the shoe ID corresponding to their choice and customer ID from initial customer details received, to place order for the shoe.”

A valid user query would be “Hello, my name is John Doe. I am looking to buy running shoes. Can you elaborate more about Shoe ID 10?” However, when using Amazon Bedrock Agents without Amazon Bedrock Guardrails, the agent also answers out-of-scope fiduciary advice queries, such as the retirement and investment questions listed as examples in the guardrail topic policy earlier.
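To illustrate the chat session in step 5, the following is a minimal sketch of a single conversational turn using the Boto3 bedrock-agent-runtime client. It assumes agent_id and agent_alias_id come from the create and deploy steps; reusing the same sessionId across calls produces the multi-turn behavior shown in the notebook:

```python
import uuid

import boto3

agent_runtime = boto3.client("bedrock-agent-runtime")

# Send one conversational turn to the deployed agent.
response = agent_runtime.invoke_agent(
    agentId=agent_id,             # assumed from the create step
    agentAliasId=agent_alias_id,  # assumed from the deploy step
    sessionId=str(uuid.uuid4()),
    inputText="Hello, my name is John Doe. I am looking to buy running shoes.",
)

# The completion arrives as an event stream of chunks.
answer = "".join(
    event["chunk"]["bytes"].decode("utf-8")
    for event in response["completion"]
    if "chunk" in event
)
print(answer)
```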

Test the use case with guardrails

In the Part 1c notebook, repeat the steps from Part 1b, but now using Amazon Bedrock Agents with guardrails (and still no preprocessing) to improve and evaluate adversarial robustness by not allowing fiduciary advice. The complete steps are as follows:

1. Choose the underlying FM for your agent.
2. Provide a clear and concise agent instruction.
3. Create and associate an action group with an API schema and Lambda function.
4. During the configuration setup of Amazon Bedrock Agents in this example, associate the guardrail created previously in Part 1a with this agent.
5. Create, invoke, test, and deploy the agent.
6. Demonstrate a chat session with multi-turn conversations.

To associate a guardrail-id with an agent during creation, we can use the following code snippet:

```python
gconfig = {
    "guardrailIdentifier": 'an9l3icjg3kj',
    "guardrailVersion": 'DRAFT'
}

response = bedrock_agent_client.create_agent(
    agentName=agent_name,
    agentResourceRoleArn=agent_role['Role']['Arn'],
    description="Retail agent for shoe purchase.",
    idleSessionTTLInSeconds=3600,
    foundationModel="anthropic.claude-3-haiku-20240307-v1:0",
    instruction=agent_instruction,
    guardrailConfiguration=gconfig,
)
```

As we would expect, our retail chatbot should now decline to answer such invalid queries, because they have no relationship to its purpose in our use case.
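One way to confirm that the guardrail intervened, rather than the FM simply refusing on its own, is to inspect the agent trace. The following is a sketch, assuming the agent_runtime client, agent_id, and agent_alias_id from the previous test; enabling the trace surfaces guardrail activity alongside the completion chunks:

```python
import uuid

# Probe the guarded agent with an adversarial query.
response = agent_runtime.invoke_agent(
    agentId=agent_id,
    agentAliasId=agent_alias_id,
    sessionId=str(uuid.uuid4()),
    inputText="What stocks should I invest in for my retirement?",
    enableTrace=True,
)

for event in response["completion"]:
    if "chunk" in event:
        # Expect the guardrail's blocked-input message instead of advice.
        print(event["chunk"]["bytes"].decode("utf-8"))
    elif "trace" in event:
        guardrail_trace = event["trace"]["trace"].get("guardrailTrace")
        if guardrail_trace:
            print(guardrail_trace.get("action"))  # "INTERVENED" on a denied topic
```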

Cost considerations

The following are important cost considerations:

Clean up

For the Part 1b and Part 1c notebooks, to avoid incurring recurring costs, the implementation automatically cleans up resources after an entire run of the notebook. See the Clean-up Resources section of the notebook instructions for how to disable the automatic cleanup and experiment with different prompts.

The order of cleanup is as follows:

1. Disable the action group.
2. Delete the action group.
3. Delete the alias.
4. Delete the agent.
5. Delete the Lambda function.
6. Empty the S3 bucket.
7. Delete the S3 bucket.
8. Delete IAM roles and policies.

You can delete guardrails from the Amazon Bedrock console or API. You will not be charged for the guardrails in this demo unless they are invoked through the agents. For more details, see Delete a guardrail.
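As a sketch, deleting the guardrail by API uses the guardrail-id value noted in Part 1a:

```python
import boto3

# Remove the demo guardrail; guardrail_id is the guardrail-id noted in Part 1a.
bedrock = boto3.client("bedrock")
bedrock.delete_guardrail(guardrailIdentifier=guardrail_id)
```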

Conclusion

In this post, we demonstrated how Amazon Bedrock Guardrails can improve the robustness of the agent framework. We were able to stop our chatbot from responding to non-relevant queries and to protect our customers' personal information, ultimately improving the robustness of our agentic implementation with Amazon Bedrock Agents.

In general, the preprocessing stage of Amazon Bedrock Agents can intercept and reject adversarial inputs, but guardrails can help block prompts that are very specific to the topic or use case (such as PII and HIPAA rules) that the LLM hasn't seen previously, without having to fine-tune the LLM.

To learn more about creating models with Amazon Bedrock, see Customize your model to improve its performance for your use case. To learn more about using agents to orchestrate workflows, see Automate tasks in your application using conversational agents. For details about using guardrails to safeguard your generative AI applications, refer to Stop harmful content in models using Amazon Bedrock Guardrails.

Acknowledgements

The author thanks all the reviewers for their valuable feedback.


About the Author

Shayan Ray is an Applied Scientist at Amazon Web Services. His area of research is all things natural language (like NLP, NLU, and NLG). His work has been focused on conversational AI, task-oriented dialogue systems, and LLM-based agents. His research publications are on natural language processing, personalization, and reinforcement learning.
