AWS Machine Learning Blog, August 1, 2024
Use the ApplyGuardrail API with long-context inputs and streaming outputs in Amazon Bedrock

As generative artificial intelligence (AI) applications become more prevalent, maintaining responsible AI principles becomes essential. Without proper safeguards, large language models (LLMs) can potentially generate harmful, biased, or inappropriate content, posing risks to individuals and organizations. Applying guardrails helps mitigate these risks by enforcing policies and guidelines that align with ethical principles and legal requirements. Guardrails for Amazon Bedrock evaluates user inputs and model responses based on use case-specific policies, and provides an additional layer of safeguards regardless of the underlying foundation model (FM). Guardrails can be applied across all LLMs on Amazon Bedrock, including fine-tuned models and even generative AI applications outside of Amazon Bedrock. You can create multiple guardrails, each configured with a different combination of controls, and use these guardrails across different applications and use cases. You can configure guardrails in multiple ways, including to deny topics, filter harmful content, remove sensitive information, and detect contextual grounding.

The new ApplyGuardrail API enables you to assess any text using your preconfigured guardrails in Amazon Bedrock, without invoking the FMs. In this post, we demonstrate how to use the ApplyGuardrail API with long-context inputs and streaming outputs.

ApplyGuardrail API overview

The ApplyGuardrail API offers several key features:

- Ease of use – You can integrate the API anywhere in your application flow, evaluating content before it is processed or before results are returned to the user.
- Decoupled from FMs – Because the API doesn't invoke a foundation model, you can use it with any model, including models hosted on Amazon SageMaker, self-hosted models, and models from third-party providers.
- Scalability – You can create multiple guardrails, each configured with a different combination of controls, and use them across different applications and use cases.

You can use the assessment results from the ApplyGuardrail API to design the experience on your generative AI application, making sure it adheres to your defined policies and guidelines.
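As an illustrative sketch of that design decision, the assessment outcome can be mapped to what the application surfaces to the user. The function name and preset message below are hypothetical, not part of the API; the fields ('action', the blocked flag, the alternate text) mirror the response shape used later in this post:

```python
def decide(action: str, is_blocked: bool, alternate_text: str, original_text: str) -> str:
    """Return the text the application should surface to the user."""
    if action != 'GUARDRAIL_INTERVENED':
        return original_text  # no intervention: pass the text through unchanged
    if is_blocked:
        # severe violation: show a preset message instead of the content
        return "Sorry, I can't help with that."
    # non-blocking intervention, e.g. PII anonymized by the guardrail
    return alternate_text

print(decide('NONE', False, '', 'What is a bank?'))  # prints the original text unchanged
```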

The ApplyGuardrail API request allows you to pass all the content that should be guarded using your defined guardrails. Set the source field to INPUT when the content to be evaluated comes from a user, typically the LLM prompt, and to OUTPUT when guardrails should be enforced on model output, typically an LLM response. An example request looks like the following code:

{
    "source": "INPUT" | "OUTPUT",
    "content": [{
        "text": {
            "text": "This is a sample text snippet..."
        }
    }]
}

For more information about the API structure, refer to Guardrails for Amazon Bedrock.

Streaming output

LLMs can generate text in a streaming manner, where the output is produced token by token or word by word, rather than generating the entire output at once. This streaming output capability is particularly useful in scenarios where real-time interaction or continuous generation is required, such as conversational AI assistants or live captioning. Incrementally displaying the output allows for a more natural and responsive user experience. Although it’s advantageous in terms of responsiveness, streaming output introduces challenges when it comes to applying guardrails in real time as the output is generated. Unlike the input scenario, where the entire text is available upfront, the output is generated incrementally, making it difficult to assess the complete context and potential violations.

One of the main challenges is the need to evaluate the output as it’s being generated, without waiting for the entire output to be complete. This requires a mechanism to continuously monitor the streaming output and apply guardrails in real time, while also considering the context and coherence of the generated text. Furthermore, the decision to halt or continue the generation process based on the guardrail assessment needs to be made in real time, which can impact the responsiveness and user experience of the application.

Solution overview: Use guardrails on streaming output

To address the challenges of applying guardrails on streaming output from LLMs, a strategy that combines batching and real-time assessment is required. This strategy involves collecting the streaming output into smaller batches or chunks, evaluating each batch using the ApplyGuardrail API, and then taking appropriate actions based on the assessment results.

The first step in this strategy is to batch the streaming output chunks into batches that are close to a text unit, which is approximately 1,000 characters. If a batch is smaller, such as 600 characters, you're still charged for an entire text unit (1,000 characters). For cost-effective usage of the API, it's recommended that batches be close to whole multiples of a text unit, such as 1,000 characters, 2,000 characters, and so on. This way, you minimize the risk of incurring unnecessary costs.
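The cost arithmetic above can be sketched as a small helper. The constant and function name are illustrative; the rule reflected here is that a partial text unit is billed as a full one:

```python
import math

TEXT_UNIT = 1000  # characters per text unit, as described above

def billed_text_units(text: str) -> int:
    """Number of text units billed for one ApplyGuardrail call on `text`."""
    return max(1, math.ceil(len(text) / TEXT_UNIT))

print(billed_text_units("x" * 600))   # 1 - a 600-character batch still costs a full unit
print(billed_text_units("x" * 1000))  # 1
print(billed_text_units("x" * 1500))  # 2
```

Batching close to whole multiples of 1,000 characters keeps the ratio of evaluated characters to billed units high.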

By batching the output into smaller batches, you can invoke the ApplyGuardrail API more frequently, allowing for real-time assessment and decision-making. The batching process should be designed to maintain the context and coherence of the generated text. This can be achieved by making sure the batches don’t split words or sentences, and by carrying over any necessary context from the previous batch. Though the chunking varies between use cases, for the sake of simplicity, this post showcases simple character-level chunking, but it’s recommended to explore options such as semantic chunking or hierarchical chunking while still adhering to the guidelines mentioned in this post.
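One way to avoid splitting sentences across batches — a sketch of the boundary heuristic described above, not the chunking used in this post's notebook — is to cut at the last sentence-ending punctuation inside the target window and carry the remainder over to the next batch:

```python
import re

def flush_at_sentence(buffer: str, target: int = 1000) -> tuple:
    """Split `buffer` at the last sentence boundary before `target` characters.

    Returns (batch_to_assess, carry_over). Falls back to a hard cut when no
    sentence boundary is found inside the window.
    """
    if len(buffer) <= target:
        return buffer, ""
    window = buffer[:target]
    m = None
    # find the last sentence-ending punctuation followed by whitespace
    for m in re.finditer(r'[.!?]\s', window):
        pass
    if m:
        cut = m.end()
        return buffer[:cut], buffer[cut:]
    return window, buffer[target:]  # hard cut as a last resort

batch, carry = flush_at_sentence("One. Two. Three words here.", target=10)
print(repr(batch))  # 'One. Two. '
print(repr(carry))  # 'Three words here.'
```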

After the streaming output has been batched into smaller chunks, each chunk can be passed to the API for evaluation. The API will assess the content of each chunk against the defined policies and guidelines, identifying any potential violations or sensitive information.

The assessment results from the API can then be used to determine the appropriate action for the current batch. If a severe violation is detected, the API assessment suggests halting the generation process, and a preset message or response can be displayed to the user instead. In other cases, no severe violation is detected, but the guardrail is configured to pass the request through with modifications, for example when sensitiveInformationPolicyConfig is set to anonymize detected entities instead of blocking. If such an intervention occurs, the output is masked or modified accordingly before being displayed to the user. For latency-sensitive applications, you can also consider creating multiple buffers and multiple guardrails, each with different policies, and processing them with the ApplyGuardrail API in parallel. This way, instead of waiting on one guardrail assessment at a time, you can obtain assessments from multiple guardrails and multiple batches concurrently, though this technique isn't implemented in this example.

Example use case: Apply guardrails to streaming output

In this section, we provide an example of how such a strategy could be implemented. Let’s begin with creating a guardrail. You can use the following code sample to create a guardrail in Amazon Bedrock:

import boto3

REGION_NAME = "us-east-1"

bedrock_client = boto3.client("bedrock", region_name=REGION_NAME)
bedrock_runtime = boto3.client("bedrock-runtime", region_name=REGION_NAME)

response = bedrock_client.create_guardrail(
    name="<name>",
    description="<description>",
    ...
)

# alternatively, provide the ID and version of your own guardrail
guardrail_id = response['guardrailId']
guardrail_version = response['version']

Proper assessment of the policies must be conducted to verify whether the input should later be sent to an LLM, or whether the output generated by the LLM should be displayed to the user. In the following code, we examine the assessments, which are part of the response from the ApplyGuardrail API, for potential severe violations leading to a BLOCKED intervention by the guardrail:

from typing import List, Dict

def check_severe_violations(violations: List[Dict]) -> int:
    """
    When a guardrail intervenes, the action on the request is either BLOCKED or NONE.
    This method counts the violations that lead to blocking the request.

    Args:
        violations (List[Dict]): A list of violation dictionaries, where each dictionary has an 'action' key.

    Returns:
        int: The number of severe violations (where the 'action' is 'BLOCKED').
    """
    severe_violations = [violation['action'] == 'BLOCKED' for violation in violations]
    return sum(severe_violations)

def is_policy_assessement_blocked(assessments: List[Dict]) -> bool:
    """
    While creating the guardrail, you can specify multiple types of policies.
    At assessment time, all the policies should be checked for potential violations.
    If there is even one violation that blocks the request, the entire request is blocked.
    This method checks if the policy assessment is blocked based on the given assessments.

    Args:
        assessments (List[Dict]): A list of assessment dictionaries, where each dictionary may contain 'topicPolicy', 'wordPolicy', 'sensitiveInformationPolicy', and 'contentPolicy' keys.

    Returns:
        bool: True if the policy assessment is blocked, False otherwise.
    """
    blocked = []
    for assessment in assessments:
        if 'topicPolicy' in assessment:
            blocked.append(check_severe_violations(assessment['topicPolicy']['topics']))
        if 'wordPolicy' in assessment:
            if 'customWords' in assessment['wordPolicy']:
                blocked.append(check_severe_violations(assessment['wordPolicy']['customWords']))
            if 'managedWordLists' in assessment['wordPolicy']:
                blocked.append(check_severe_violations(assessment['wordPolicy']['managedWordLists']))
        if 'sensitiveInformationPolicy' in assessment:
            if 'piiEntities' in assessment['sensitiveInformationPolicy']:
                blocked.append(check_severe_violations(assessment['sensitiveInformationPolicy']['piiEntities']))
            if 'regexes' in assessment['sensitiveInformationPolicy']:
                blocked.append(check_severe_violations(assessment['sensitiveInformationPolicy']['regexes']))
        if 'contentPolicy' in assessment:
            blocked.append(check_severe_violations(assessment['contentPolicy']['filters']))
    severe_violation_count = sum(blocked)
    print(f'::Guardrail:: {severe_violation_count} severe violations detected')
    return severe_violation_count > 0

We can then define how to apply the guardrail. If the response from the API has action == 'GUARDRAIL_INTERVENED', the guardrail has detected a potential violation. We then need to check whether the violation was severe enough to block the request, or whether the request can pass through with either the same text as the input or an alternate text modified according to the defined policies:

def apply_guardrail(text, source, guardrail_id, guardrail_version):
    response = bedrock_runtime.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source=source,
        content=[{"text": {"text": text}}]
    )
    if response['action'] == 'GUARDRAIL_INTERVENED':
        is_blocked = is_policy_assessement_blocked(response['assessments'])
        alternate_text = ' '.join([output['text'] for output in response['output']])
        return is_blocked, alternate_text, response
    else:
        # Return the default response in case of no guardrail intervention
        return False, text, response

Let's now apply our strategy to streaming output from an LLM. We maintain a buffer_text, which accumulates a batch of chunks received from the stream. As soon as len(buffer_text + new_text) > TEXT_UNIT, meaning the batch is close to a text unit (1,000 characters), it's ready to be sent to the ApplyGuardrail API. With this mechanism, we make sure we don't incur the unnecessary cost of invoking the API on smaller chunks, and also that enough context is available inside each batch for the guardrail to make meaningful assessments. Additionally, when the LLM finishes generating, the final batch must also be tested for potential violations. If at any point the API detects severe violations, further consumption of the stream is halted and the preset message defined at the time of guardrail creation is displayed to the user.
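Before the real example, the buffering mechanism can be simulated end to end with a stub in place of the ApplyGuardrail call. The stub and its forbidden-word rule are purely illustrative; the loop structure matches the strategy described above:

```python
TEXT_UNIT = 1000  # characters

def stub_apply_guardrail(text: str) -> bool:
    """Stand-in for the API: treat any batch containing a forbidden word as BLOCKED."""
    return "forbidden" in text

def stream_with_guardrail(chunks, text_unit=TEXT_UNIT):
    """Consume streamed chunks, assessing text-unit-sized batches; halt on a block."""
    emitted, buffer = [], ""
    for chunk in chunks:
        if len(buffer + chunk) > text_unit:
            if stub_apply_guardrail(buffer):
                return emitted, True  # halt: severe violation detected mid-stream
            emitted.append(buffer)
            buffer = chunk
        else:
            buffer += chunk
    if buffer:  # the final batch must also be assessed
        if stub_apply_guardrail(buffer):
            return emitted, True
        emitted.append(buffer)
    return emitted, False

print(stream_with_guardrail(["abc", "forbidden", "def"], text_unit=5))
```

The second return value signals whether consumption of the stream was halted, mirroring the break on is_blocked in the real example that follows.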

In the following example, we ask the LLM to generate three names and explain what a bank is. This generation leads to GUARDRAIL_INTERVENED but doesn't block the generation; instead, the guardrail anonymizes the text (masking the names) and generation continues.

input_message = "List 3 names of prominent CEOs and later tell me what is a bank and what are the benefits of opening a savings account?"
model_id = "anthropic.claude-3-haiku-20240307-v1:0"
text_unit = 1000  # characters

response = bedrock_runtime.converse_stream(
    modelId=model_id,
    messages=[{
        "role": "user",
        "content": [{"text": input_message}]
    }],
    system=[{"text": "You are an assistant that helps with tasks from users. Be as elaborate as possible"}],
)

stream = response.get('stream')
buffer_text = ""
if stream:
    for event in stream:
        if 'contentBlockDelta' in event:
            new_text = event['contentBlockDelta']['delta']['text']
            if len(buffer_text + new_text) > text_unit:
                is_blocked, alt_text, guardrail_response = apply_guardrail(buffer_text, "OUTPUT", guardrail_id, guardrail_version)
                print(alt_text, end="")
                if is_blocked:
                    break
                buffer_text = new_text
            else:
                buffer_text += new_text
        if 'messageStop' in event:
            # print(f"\nStop reason: {event['messageStop']['stopReason']}")
            is_blocked, alt_text, guardrail_response = apply_guardrail(buffer_text, "OUTPUT", guardrail_id, guardrail_version)
            print(alt_text)

After running the preceding code, we receive an example output with masked names:

Certainly! Here are three names of prominent CEOs:

1. {NAME} - CEO of Apple Inc.
2. {NAME} - CEO of Microsoft Corporation
3. {NAME} - CEO of Amazon

Now, let's discuss what a bank is and the benefits of opening a savings account.

A bank is a financial institution that accepts deposits, provides loans, and offers various other financial services to its customers. Banks play a crucial role in the economy by facilitating the flow of money and enabling financial transactions.

Long-context inputs

RAG is a technique that enhances LLMs by incorporating external knowledge sources. It allows LLMs to reference authoritative knowledge bases before generating responses, producing output tailored to specific contexts while providing relevance, accuracy, and efficiency. The input to the LLM in a RAG scenario can be quite long, because it includes the user’s query concatenated with the retrieved information from the knowledge base. This long-context input poses challenges when applying guardrails, because the input may exceed the character limits imposed by the ApplyGuardrail API. To learn more about the quotas applied to Guardrails for Amazon Bedrock, refer to Guardrails quotas.

In the previous section, we evaluated a strategy to mitigate risk from the model response. With inputs, the risk can exist at the query level alone, or in the combination of the query and the retrieved context for that query.

The retrieved information from the knowledge base may contain sensitive or potentially harmful content, which needs to be identified and handled appropriately, for example masking sensitive information, before being passed to the LLM for generation. Therefore, guardrails must be applied to the entire input to make sure it adheres to the defined policies and constraints.

Solution overview: Use guardrails on long-context inputs

The ApplyGuardrail API has a default limit of 25 text units (approximately 25,000 characters) per second. If the input exceeds this limit, it needs to be chunked and processed sequentially to avoid throttling. The strategy is therefore straightforward: if the input text is shorter than 25 text units (25,000 characters), it can be evaluated in a single request; otherwise, it needs to be broken down into smaller pieces. The chunk size can vary depending on application behavior and the type of context in the application; you can start with 12 text units and iterate to find the most suitable chunk size. This way, we maximize the allowed default limit while keeping most of the context intact in a single request. Even if the guardrail action is GUARDRAIL_INTERVENED, it doesn't mean the input is BLOCKED. It could also be that the input was processed and sensitive information was masked; in this case, the input text must be recompiled with the processed response from the applied guardrail.

from textwrap import wrap

text_unit = 1000  # characters
limit_text_unit = 25
max_text_units_in_chunk = 12

def apply_guardrail_with_chunking(text, guardrail_id, guardrail_version="DRAFT"):
    text_length = len(text)
    filtered_text = ''
    if text_length <= limit_text_unit * text_unit:
        return apply_guardrail(text, "INPUT", guardrail_id, guardrail_version)
    else:
        # If the text length exceeds the default text unit limits, chunk the text to avoid throttling.
        for i, chunk in enumerate(wrap(text, max_text_units_in_chunk * text_unit)):
            print(f'::Guardrail::Applying guardrails at chunk {i+1}')
            is_blocked, alternate_text, response = apply_guardrail(chunk, "INPUT", guardrail_id, guardrail_version)
            if is_blocked:
                filtered_text = alternate_text
                break
            # It could be the case that guardrails intervened and anonymized PII in the input text;
            # we can then take the output from guardrails to create the filtered text response.
            filtered_text += alternate_text
        return is_blocked, filtered_text, response

Run the full notebook to test this strategy with long-context input.

Best practices and considerations

When applying guardrails, it's essential to follow best practices to maintain efficient and effective content moderation. Regularly audit your guardrail implementation, continuously refine and adapt it as your policies evolve, and implement logging and monitoring mechanisms to capture and analyze the performance and effectiveness of your guardrails.

Clean up

The only resource we created in this example is a guardrail. To delete the guardrail, complete the following steps:

1. On the Amazon Bedrock console, under Safeguards in the navigation pane, choose Guardrails.
2. Select the guardrail you created and choose Delete.

Alternatively, you can use the SDK:

bedrock_client.delete_guardrail(guardrailIdentifier = "<your_guardrail_id>")

Key takeaways

Applying guardrails is crucial for maintaining responsible and safe content generation. With the ApplyGuardrail API from Amazon Bedrock, you can effectively moderate both inputs and outputs, protecting your generative AI application against violations and maintaining compliance with your content policies.

Key takeaways from this post include:

- The ApplyGuardrail API evaluates any text against your preconfigured guardrails without invoking a foundation model, so the same safeguards can be applied across models and applications.
- For streaming outputs, batch the generated chunks into text-unit-sized batches and assess each batch in near real time, halting generation or masking content when the guardrail intervenes.
- For long-context inputs, such as RAG prompts, chunk the input to stay within the API's text unit quota and recombine any processed text returned by the guardrail.

Benefits

By incorporating the ApplyGuardrail API into your generative AI application, you gain model-agnostic content moderation, real-time safeguards on streaming outputs, and cost-effective evaluation of long-context inputs.

Conclusion

By using the ApplyGuardrail API from Amazon Bedrock and following the best practices outlined in this post, you can make sure your generative AI application remains safe, responsible, and compliant with content moderation standards, even with long-context inputs and streaming outputs.

To further explore the capabilities of the ApplyGuardrail API and its integration with your generative AI application, consider experimenting with the API in your own use cases.



About the Author

Talha Chattha is a Generative AI Specialist Solutions Architect at Amazon Web Services, based in Stockholm. Talha helps establish practices to ease the path to production for Gen AI workloads. Talha is an expert in Amazon Bedrock and supports customers across EMEA. He is passionate about meta-agents, scalable on-demand inference, advanced RAG solutions, and cost-optimized prompt engineering with LLMs. When not shaping the future of AI, he explores the scenic European landscapes and delicious cuisines. Connect with Talha at LinkedIn using /in/talha-chattha/.
