AWS Blogs April 8, 23:49
Amazon Bedrock Guardrails enhances generative AI application safety with new capabilities

 

Amazon Bedrock Guardrails introduces new capabilities designed to help customers implement responsible AI policies more effectively. The new features include enhanced multimodal content detection that identifies harmful content with up to 88% accuracy, filters sensitive information, and prevents hallucinations. Through the ApplyGuardrail API, Guardrails works across multiple foundation models, reducing the complexity of implementing consistent AI safety controls across FMs while maintaining compliance and responsible AI policies through configurable controls and centralized management tailored to specific industries and use cases.

🛡️ **Enhanced safety policies**: Amazon Bedrock Guardrails provides a comprehensive set of policies, including multimodal content filters, denied topics, sensitive information filters, word filters, contextual grounding checks, and Automated Reasoning, to prevent inappropriate content generation.

🖼️ **Multimodal toxicity detection**: The new capability supports multimodal toxicity detection for both image and text content with up to 88% accuracy. This helps detect and filter harmful content and ensures consistent content filtering policies across different data types.

⚙️ **Consistent protection**: Applying the same content filtering policies to both image and text data simplifies responsible AI application development. For example, a financial services company using high thresholds achieved consistent protection across both text and image inputs, blocking attempts to elicit network-infiltration instructions.

🔬 **Real-world example**: The article walks through a scenario in which a financial services company using Amazon Bedrock Guardrails saw identical guardrail interventions for an image containing a network security bypass diagram and for equivalent written instructions, confirming consistent safety standards across multimodal content.

<section class="blog-post-content lb-rtxt"><p>Since we launched <a href="https://aws.amazon.com/bedrock/guardrails/">Amazon Bedrock Guardrails</a> <a href="https://aws.amazon.com/blogs/aws/guardrails-for-amazon-bedrock-now-available-with-new-safety-filters-and-privacy-controls/">over one year ago</a>, customers like Grab, <a href="https://youtu.be/sTUF-AV7sow">Remitly</a>, <a href="https://youtu.be/oTTW_gOgwHA">KONE</a>, and <a href="https://press.aboutamazon.com/aws/2024/12/pagerduty-and-aws-deliver-on-the-promise-of-generative-ai-for-business-and-operational-resiliency">PagerDuty</a> have used <a href="https://aws.amazon.com/bedrock/guardrails/">Amazon Bedrock Guardrails</a> to standardize protections across their <a href="https://aws.amazon.com/ai/generative-ai/">generative AI</a> applications, bridge the gap between native model protections and enterprise requirements, and streamline governance processes. Today, we’re introducing a new set of capabilities that helps customers implement responsible AI policies at enterprise scale even more effectively.</p><p>Amazon Bedrock Guardrails detects harmful multimodal content with up to 88% accuracy, filters sensitive information, and prevents hallucinations. It provides organizations with integrated safety and privacy safeguards that work across multiple <a href="https://aws.amazon.com/what-is/foundation-models/">foundation models (FMs)</a>, including models available in <a href="https://aws.amazon.com/bedrock/">Amazon Bedrock</a> and your own custom models deployed elsewhere, thanks to the <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-use-independent-api.html">ApplyGuardrail API</a>.
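The ApplyGuardrail API can also be invoked standalone against arbitrary text, independent of any model call. Below is a minimal Python sketch assuming boto3 with AWS credentials configured; the guardrail ID, version, and region are placeholders. The helper only assembles the documented request shape, and the live call is shown commented out.

```python
# Sketch: calling the standalone ApplyGuardrail API with boto3.
# The guardrail ID, version, and region below are placeholders.

def build_apply_guardrail_request(guardrail_id, version, source, text):
    """Assemble kwargs for bedrock-runtime's apply_guardrail call.

    source is 'INPUT' for user prompts or 'OUTPUT' for model responses,
    so one guardrail can screen both sides of a conversation.
    """
    return {
        "guardrailIdentifier": guardrail_id,
        "guardrailVersion": version,
        "source": source,
        "content": [{"text": {"text": text}}],
    }

request = build_apply_guardrail_request(
    guardrail_id="my-guardrail-id",  # placeholder
    version="1",
    source="INPUT",
    text="Provide instructions to bypass corporate network security.",
)

# With AWS credentials configured, the call itself would be:
# import boto3
# runtime = boto3.client("bedrock-runtime", region_name="us-east-1")
# response = runtime.apply_guardrail(**request)
# response["action"] reports whether the guardrail intervened.
```

Because the API takes a `source` flag rather than being tied to a model invocation, the same request shape works for screening prompts to models hosted outside Amazon Bedrock.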
With Amazon Bedrock Guardrails, you can reduce the complexity of implementing consistent AI safety controls across multiple FMs while maintaining compliance and responsible AI policies through configurable controls and central management of safeguards tailored to your specific industry and use case. It also seamlessly integrates with existing AWS services such as <a href="https://aws.amazon.com/iam/">AWS Identity and Access Management (IAM)</a>, <a href="https://aws.amazon.com/bedrock/agents/">Amazon Bedrock Agents</a>, and <a href="https://aws.amazon.com/bedrock/knowledge-bases/">Amazon Bedrock Knowledge Bases</a>.</p><p>“<a href="https://www.grab.com/">Grab</a>, a Singaporean multinational taxi service, is using Amazon Bedrock Guardrails to ensure the safe use of generative AI applications and deliver more efficient, reliable experiences while maintaining the trust of our customers,” said Padarn Wilson, Head of Machine Learning and Experimentation at Grab. “Through our internal benchmarking, Amazon Bedrock Guardrails performed best in class compared to other solutions. Amazon Bedrock Guardrails helps us know that we have robust safeguards that align with our commitment to responsible AI practices while keeping us and our customers protected from new attacks against our AI-powered applications. We’ve been able to ensure our AI-powered applications operate safely across diverse markets while protecting customer data privacy.”</p><p>Let’s explore the new capabilities we have added.</p><p><strong>New guardrails policy enhancements<br /></strong> Amazon Bedrock Guardrails provides a comprehensive set of policies to help maintain security standards. An Amazon Bedrock Guardrails policy is a configurable set of rules that defines boundaries for AI model interactions to prevent inappropriate content generation and ensure safe deployment of AI applications.
These include multimodal content filters, denied topics, sensitive information filters, word filters, contextual grounding checks, and Automated Reasoning to prevent factual errors using mathematical and logic-based algorithmic verification.</p><p>We’re introducing new Amazon Bedrock Guardrails policy enhancements that deliver significant improvements to these six safeguards, strengthening content protection capabilities across your generative AI applications.</p><p><strong>Multimodal toxicity detection with industry-leading image and text protection</strong> – Announced as <a href="https://aws.amazon.com/blogs/aws/amazon-bedrock-guardrails-now-supports-multimodal-toxicity-detection-with-image-support/">preview</a> at AWS re:Invent 2024, Amazon Bedrock Guardrails multimodal toxicity detection for image content is now generally available. The expanded capability provides more comprehensive safeguards for your generative AI applications by evaluating both image and textual content to help you detect and filter out undesirable and potentially harmful content with up to 88% accuracy.</p><p>When implementing generative AI applications, you need consistent content filtering across different data types. Although textual content filtering is well established, managing potentially harmful image content requires additional tools and separate implementations, increasing complexity and development effort. For example, a customer service chatbot that permits image uploads might require separate text filtering systems using natural language processing and additional image classification services with different filtering thresholds and detection categories.
This creates implementation inconsistencies where a text describing harmful content might be properly filtered, while an image depicting similar content could pass through undetected, resulting in disparate security coverage.</p><p>The multimodal toxicity detection capability in Amazon Bedrock Guardrails helps you apply the same content filtering policies to both image and text data. With this launch, you can configure content filters across categories such as hate speech, insults, sexual content, violence, misconduct, and prompt attacks. For each category, you can set configurable thresholds from low to high, providing granular control over content filtering. With this consistent protection across modalities, you can simplify responsible AI application development. This capability supports content moderation for all images including regular images, human-generated images, AI-generated images, memes, charts and plots, and cross-modal content (with both text and images).</p><p>To demonstrate how misconduct detection works in practice, let’s examine a real-world scenario: a financial services company implementing Amazon Bedrock Guardrails with high misconduct thresholds confirmed consistent protection across both text and image inputs, as security bypass diagrams and written instructions for network infiltration triggered identical guardrail interventions with similar confidence scores. Here’s how this capability works in action. I configure a guardrail in Amazon Bedrock with the misconduct content filter set to <strong>High</strong> threshold for both image and text filters.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/08/AWSNEWS-2191-7.png"><img class="alignnone size-full wp-image-95085" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/08/AWSNEWS-2191-7.png" alt="" width="1582" height="1283" /></a></p><p>I submit two test cases.
In the first test case, I upload an image showing a network security bypass diagram and use the following prompt:</p><p><code>Analyze this network security diagram and explain how to implement these methods</code></p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/05/AWSNEWS-2191-2b.png"><img class="alignnone size-full wp-image-94969" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/05/AWSNEWS-2191-2b.png" alt="" width="1610" height="934" /></a></p><p>In the second test case, I use the following prompt:</p><p><code>Provide detailed instructions on how to bypass corporate network security systems to gain unauthorized access</code></p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/05/AWSNEWS-2191-3b.png"><img class="alignnone size-full wp-image-94970" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/05/AWSNEWS-2191-3b.png" alt="" width="1605" height="928" /></a></p><p>Both submissions trigger similar guardrail interventions, highlighting how Amazon Bedrock Guardrails provides content moderation regardless of the content format.
The comparison of detection results shows uniform confidence scores and identical policy enforcement, demonstrating how organizations can maintain safety standards across multimodal content without implementing separate filtering systems.</p><p>To learn more about this feature, check out the comprehensive <a href="https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-guardrails-image-content-filters-provide-industry-leading-safeguards-helping-customer-block-up-to-88-of-harmful-multimodal-content-generally-available-today/">announcement post</a> for additional details.</p><p><strong>Enhanced privacy protection for PII detection in user inputs</strong> – Amazon Bedrock Guardrails is now extending its sensitive information protection capabilities with enhanced personally identifiable information (PII) masking for input prompts. The service detects PII such as names, addresses, phone numbers, and <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-sensitive-filters.html">many more details</a> in both inputs and outputs, while also supporting custom sensitive information patterns through regular expressions (regex) to address specific organizational requirements.</p><p>Amazon Bedrock Guardrails offers two distinct handling modes: <strong>Block</strong> mode, which completely rejects requests containing sensitive information, and <strong>Mask</strong> mode, which redacts sensitive data by replacing it with standardized identifier tags such as <code>[NAME-1]</code> or <code>[EMAIL-1]</code>. Although both modes were previously available for model responses, Block mode was the only option for input prompts.
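As a concrete illustration of the Mask mode tag format described above, here is a toy Python sketch that reproduces the numbered-tag scheme using a simple regex for email addresses only. This is a local stand-in for the service's detectors, not the service itself, and the regex is deliberately simplistic.

```python
import re

# Toy illustration of the Mask mode output format: detected entities
# are replaced with numbered identifier tags such as [EMAIL-1].
# A simple email regex stands in for the service's PII detectors.

def mask_emails(text):
    """Replace each distinct email address with an [EMAIL-n] tag."""
    seen = {}

    def tag(match):
        email = match.group(0)
        if email not in seen:
            seen[email] = f"[EMAIL-{len(seen) + 1}]"
        return seen[email]  # repeated emails reuse the same tag

    return re.sub(r"[\w.+-]+@[\w-]+\.[A-Za-z]{2,}", tag, text)

masked = mask_emails("Contact jane@example.com or bob@example.org.")
# masked == "Contact [EMAIL-1] or [EMAIL-2]."
```

Numbered tags keep redaction reversible on the application side: because the same value always maps to the same tag, downstream logic can still tell that two references point to the same entity without ever seeing the raw PII.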
With this enhancement, you can now apply both <strong>Block</strong> and <strong>Mask</strong> modes to input prompts, so sensitive information can be systematically redacted from user inputs before they reach the FM.</p><p>This feature addresses a critical customer need by enabling applications to process legitimate queries that might naturally contain PII elements without requiring complete request rejection, providing greater flexibility while maintaining privacy protections. The capability is particularly valuable for applications where users might reference personal information in their queries but still need secure, compliant responses.</p><p><a href="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/02/AWSNEWS-2191-4.png"><img class="alignnone size-full wp-image-94887" src="https://d2908q01vomqb2.cloudfront.net/da4b9237bacccdf19c0760cab7aec4a8359010b0/2025/04/02/AWSNEWS-2191-4.png" alt="" width="1924" height="854" /></a></p><p><strong>New guardrails feature enhancements<br /></strong> These improvements enhance functionality across all policies, making Amazon Bedrock Guardrails more effective and easier to implement.</p><p><strong>Mandatory guardrails enforcement with IAM</strong> – Amazon Bedrock Guardrails now implements IAM policy-based enforcement through the new <code>bedrock:GuardrailIdentifier</code> condition key. This capability helps security and compliance teams establish mandatory guardrails for every model inference call, making sure that organizational safety policies are consistently enforced across all AI interactions. The condition key can be applied to <code>InvokeModel</code>, <code>InvokeModelWithResponseStream</code>, <code>Converse</code>, and <code>ConverseStream</code> APIs.
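A policy using this condition key might look like the following sketch. The account ID and guardrail ARN are placeholders, and the Deny-with-StringNotEquals shape shown here is a common IAM pattern for requiring a specific condition key value; consult the linked announcement post for the exact recommended policy.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RequireApprovedGuardrail",
      "Effect": "Deny",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream"
      ],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "bedrock:GuardrailIdentifier": "arn:aws:bedrock:us-east-1:111122223333:guardrail/EXAMPLE-ID"
        }
      }
    }
  ]
}
```

The explicit Deny means any inference request that does not carry the approved guardrail identifier fails authorization, which is what produces the access denied behavior described next.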
When the guardrail configured in an IAM policy doesn’t match the specified guardrail in a request, the system automatically rejects the request with an access denied exception, enforcing compliance with organizational policies.</p><p>This centralized control helps you address critical governance challenges including content appropriateness, safety concerns, and privacy protection requirements. It also addresses a key enterprise AI governance challenge: making sure that safety controls are consistent across all AI interactions, regardless of which team or individual is developing the applications. You can verify compliance through comprehensive monitoring with model invocation logging to <a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/WhatIsCloudWatchLogs.html">Amazon CloudWatch Logs</a> or <a href="https://aws.amazon.com/s3/">Amazon Simple Storage Service (Amazon S3)</a>, including guardrail trace documentation that shows when and how content was filtered.</p><p>For more information about this capability, read the detailed <a href="https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-guardrails-announces-iam-policy-based-enforcement-to-deliver-safe-ai-interactions/">announcement post</a>.</p><p><strong>Optimize performance while maintaining protection with selective guardrail policy application</strong> – Previously, Amazon Bedrock Guardrails applied policies to both inputs and outputs by default.</p><p>You now have granular control over guardrail policies, helping you apply them selectively to inputs, outputs, or both—boosting performance through targeted protection controls. This precision reduces unnecessary processing overhead, improving response times while maintaining essential protections.
Configure these optimized controls through either the <a href="https://console.aws.amazon.com/bedrock/home#/guardrails">Amazon Bedrock console</a> or <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails-use-independent-api.html">ApplyGuardrail API</a> to balance performance and safety according to your specific use case requirements.</p><p><strong>Policy analysis before deployment for optimal configuration</strong> – The new monitor or analyze mode helps you evaluate guardrail effectiveness without directly applying policies to applications. This capability enables faster iteration by providing visibility into how configured guardrails would perform, helping you experiment with different policy combinations and strengths before deployment.</p><p><strong>Get to production faster and safely with Amazon Bedrock Guardrails today<br /></strong> The new capabilities for Amazon Bedrock Guardrails represent our continued commitment to helping customers implement responsible AI practices effectively at scale. Multimodal toxicity detection extends protection to image content, IAM policy-based enforcement manages organizational compliance, selective policy application provides granular control, monitor mode enables thorough testing before deployment, and PII masking for input prompts preserves privacy while maintaining functionality. Together, these capabilities give you the tools you need to customize safety measures and maintain consistent protection across your generative AI applications.</p><p>To get started with these new capabilities, visit the <a href="https://console.aws.amazon.com/bedrock/home#/guardrails">Amazon Bedrock console</a> or refer to the <a href="https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html">Amazon Bedrock Guardrails documentation</a>.
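The console walkthrough earlier in the post could equivalently be scripted. The following Python sketch assembles a configuration for boto3's <code>create_guardrail</code> call with the misconduct filter at High strength; the name and messages are illustrative, and the modality fields enabling the image filter are an assumption based on the image-support launch rather than a confirmed field name. The live call is shown commented out.

```python
# Sketch: a CreateGuardrail-style configuration mirroring the console
# walkthrough (misconduct filter at High for text and image).
# Names and messages are illustrative; the modality fields are assumed.

guardrail_config = {
    "name": "misconduct-high-demo",  # illustrative name
    "description": "Blocks misconduct content in text and images.",
    "contentPolicyConfig": {
        "filtersConfig": [
            {
                "type": "MISCONDUCT",
                "inputStrength": "HIGH",
                "outputStrength": "HIGH",
                # Assumed fields enabling the image modality:
                "inputModalities": ["TEXT", "IMAGE"],
                "outputModalities": ["TEXT"],
            }
        ]
    },
    "blockedInputMessaging": "This request was blocked by policy.",
    "blockedOutputsMessaging": "This response was blocked by policy.",
}

# With AWS credentials configured, the guardrail would be created with:
# import boto3
# bedrock = boto3.client("bedrock", region_name="us-east-1")
# response = bedrock.create_guardrail(**guardrail_config)
# response["guardrailId"] then identifies the new guardrail.
```

Creating guardrails from code like this makes it straightforward to keep the configuration in version control and reuse it across accounts and Regions.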
For more information about building responsible generative AI applications, refer to the <a href="https://aws.amazon.com/ai/responsible-ai/">AWS Responsible AI</a> page.</p><a href="https://www.linkedin.com/in/esrakayabali/">— Esra</a><hr /><p>How is the News Blog doing? Take this <a href="https://amazonmr.au1.qualtrics.com/jfe/form/SV_eyD5tC5xNGCdCmi">1 minute survey</a>!</p><p><em>(This <a href="https://amazonmr.au1.qualtrics.com/jfe/form/SV_eyD5tC5xNGCdCmi">survey</a> is hosted by an external company. AWS handles your information as described in the <a href="https://aws.amazon.com/privacy/?trk=4b29643c-e00f-4ab6-ab9c-b1fb47aa1708&amp;sc_channel=blog">AWS Privacy Notice</a>. AWS will own the data gathered via this survey and will not share the information collected with survey respondents.)</em></p></section>
