AWS Machine Learning Blog, July 19, 2024
Intelligent document processing using Amazon Bedrock and Anthropic Claude


Generative artificial intelligence (AI) not only empowers innovation through ideation, content creation, and enhanced customer service, but also streamlines operations and boosts productivity across various domains. To effectively harness this transformative technology, Amazon Bedrock offers a fully managed service that integrates high-performing foundation models (FMs) from leading AI companies, such as AI21 Labs, Anthropic, Cohere, Meta, Stability AI, Mistral AI, and Amazon. By providing access to these advanced models through a single API and supporting the development of generative AI applications with an emphasis on security, privacy, and responsible AI, Amazon Bedrock enables you to use AI to explore new avenues for innovation and improve overall offerings.

Enterprise customers can unlock significant value by harnessing the power of intelligent document processing (IDP) augmented with generative AI. By infusing IDP solutions with generative AI capabilities, organizations can revolutionize their document processing workflows, achieving exceptional levels of automation and reliability. This combination enables advanced document understanding, highly effective structured data extraction, automated document classification, and seamless information retrieval from unstructured text. With these capabilities, organizations can achieve scalable, efficient, and high-value document processing that drives business transformation and competitiveness, ultimately leading to improved productivity, reduced costs, and enhanced decision-making.

In this post, we show how to develop an IDP solution using Anthropic Claude 3 Sonnet on Amazon Bedrock. We demonstrate how to extract data from a scanned document and insert it into a database.

The Anthropic Claude 3 Sonnet model is optimized for speed and efficiency, making it an excellent choice for intelligent tasks—particularly for enterprise workloads. It also possesses sophisticated vision capabilities, demonstrating a strong aptitude for understanding a wide range of visual formats, including photos, charts, graphs, and technical diagrams. Although we demonstrate this solution using the Anthropic Claude 3 Sonnet model, you can alternatively use the Haiku and Opus models if your use case requires them.

Solution overview

The proposed solution uses Amazon Bedrock and the powerful Anthropic Claude 3 Sonnet model to enable IDP capabilities. The architecture consists of several AWS services seamlessly integrated with Amazon Bedrock, enabling efficient and accurate extraction of data from scanned documents.

The following diagram illustrates our solution architecture.

The solution consists of the following steps:

    1. The process begins with scanned documents being uploaded and stored in an Amazon Simple Storage Service (Amazon S3) bucket, which invokes an S3 Event Notification on object upload.
    2. This event invokes an AWS Lambda function, responsible for invoking the Anthropic Claude 3 Sonnet model on Amazon Bedrock.
    3. The Anthropic Claude 3 Sonnet model, with its advanced multimodal capabilities, processes the scanned documents and extracts relevant data in a structured JSON format.
    4. The extracted data from the Anthropic Claude 3 model is sent to an Amazon Simple Queue Service (Amazon SQS) queue. Amazon SQS acts as a buffer, allowing components to send and receive messages reliably without being directly coupled, providing scalability and fault tolerance in the system.
    5. Another Lambda function consumes the messages from the SQS queue, parses the JSON data, and stores the extracted key-value pairs in an Amazon DynamoDB table for retrieval and further processing.

This serverless architecture takes advantage of the scalability and cost-effectiveness of AWS services while harnessing the cutting-edge intelligence of Anthropic Claude 3 Sonnet. By combining the robust infrastructure of AWS with Anthropic’s FMs, this solution enables organizations to streamline their document processing workflows, extract valuable insights, and enhance overall operational efficiency.

The solution uses the following services and features:

- Amazon Bedrock
- Anthropic Claude 3 model family
- Amazon DynamoDB
- AWS Lambda
- Amazon SQS
- Amazon S3

In this solution, we use the generative AI capabilities in Amazon Bedrock to efficiently extract data. As of this writing, Anthropic Claude 3 Sonnet only accepts images as input. The supported file types are GIF, JPEG, PNG, and WebP. You can choose to save images during the scanning process or convert the PDF to images.
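Because only these image types are accepted, it can help to filter uploads before they reach the pipeline. The following is a minimal sketch; the helper name is ours, not part of the solution code:

```python
# File types Anthropic Claude 3 accepts as image input (per the list above).
SUPPORTED_EXTENSIONS = {".gif", ".jpeg", ".jpg", ".png", ".webp"}

def is_supported_image(object_key: str) -> bool:
    """Return True if the S3 object key has an extension Claude 3 can process."""
    key = object_key.lower()
    return any(key.endswith(ext) for ext in SUPPORTED_EXTENSIONS)
```

You could call this in the first Lambda function and skip (or route to an error queue) any object that fails the check.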

You can also enhance this solution by implementing human-in-the-loop and model evaluation features. The goal of this post is to demonstrate how you can build an IDP solution using Amazon Bedrock. To use it as a production-scale solution, you should take additional considerations into account, such as testing for edge case scenarios, improving exception handling, trying additional prompting techniques, model fine-tuning, model evaluation, throughput requirements, the number of concurrent requests to support, and cost and latency implications.

Prerequisites

You need the following prerequisites before you can proceed with this solution. For this post, we use the us-east-1 AWS Region. For details on available Regions, see Amazon Bedrock endpoints and quotas.

Use case and dataset

For our example use case, let’s look at a state agency responsible for issuing birth certificates. The agency may receive birth certificate applications through various methods, such as online applications, forms completed at a physical location, and mailed-in completed paper applications. Today, most agencies spend a considerable amount of time and resources to manually extract the application details. The process begins with scanning the application forms, manually extracting the details, and then entering them into an application that eventually stores the data into a database. This process is time-consuming, inefficient, not scalable, and error-prone. Additionally, it adds complexity if the application form is in a different language (such as Spanish).

For this demonstration, we use sample scanned images of birth certificate application forms. These forms don’t contain any real personal data. Two examples are provided: one in English (handwritten) and another in Spanish (printed). Save these images as .jpeg files to your computer. You need them later for testing the solution.

Create an S3 bucket

On the Amazon S3 console, create a new bucket with a unique name (for example, bedrock-claude3-idp-{random characters to make it globally unique}) and leave the other settings as default. Within the bucket, create a folder named images and a sub-folder named birth_certificates.
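If you prefer to script the console steps above, the following boto3 sketch creates the bucket and the folder placeholders. The function and helper names are ours; note that S3 has no real folders, so the "folders" are zero-byte objects whose keys end in `/`:

```python
import uuid

def unique_bucket_name(prefix: str = "bedrock-claude3-idp") -> str:
    # Bucket names must be globally unique; append random characters.
    return f"{prefix}-{uuid.uuid4().hex[:8]}"

def create_idp_bucket(bucket_name: str) -> None:
    import boto3  # imported here so the name helper above works without boto3 installed
    s3 = boto3.client("s3", region_name="us-east-1")
    s3.create_bucket(Bucket=bucket_name)  # us-east-1 needs no LocationConstraint
    # "Folders" in S3 are zero-byte objects whose keys end with "/".
    s3.put_object(Bucket=bucket_name, Key="images/birth_certificates/")
```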

Create an SQS queue

On the Amazon SQS console, create a queue with the Standard queue type, provide a name (for example, bedrock-idp-extracted-data), and leave the other settings as default.
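As a scripted alternative, the sketch below creates the queue with boto3 and includes a small helper that builds the queue ARN, which the IAM policies later in this post reference. The helper names are ours:

```python
def queue_arn(region: str, account_id: str, queue_name: str) -> str:
    # The IAM policies in this post reference the queue by ARN in this format.
    return f"arn:aws:sqs:{region}:{account_id}:{queue_name}"

def create_extraction_queue(queue_name: str = "bedrock-idp-extracted-data") -> str:
    import boto3  # imported here so queue_arn() is usable without boto3 installed
    sqs = boto3.client("sqs", region_name="us-east-1")
    # create_queue builds a Standard queue by default and returns the queue URL,
    # which the first Lambda function needs for send_message.
    return sqs.create_queue(QueueName=queue_name)["QueueUrl"]
```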

Create a Lambda function to invoke the Amazon Bedrock model

On the Lambda console, create a function (for example, invoke_bedrock_claude3), choose Python 3.12 for the runtime, and leave the remaining settings as default. Later, you configure this function to be invoked every time a new image is uploaded into the S3 bucket. You can download the entire Lambda function code from invoke_bedrock_claude3.py. Replace the contents of the lambda_function.py file with the code from the downloaded file. Make sure to substitute {SQS URL} with the URL of the SQS queue you created earlier, then choose Deploy.

The Lambda function should perform the following actions:

import base64
import json

import boto3

s3 = boto3.client('s3')
sqs = boto3.client('sqs')
bedrock = boto3.client('bedrock-runtime', region_name='us-east-1')
QUEUE_URL = {SQS URL}
MODEL_ID = "anthropic.claude-3-sonnet-20240229-v1:0"

The following code gets the image from the S3 bucket using the get_object method and converts it to base64 data:

image_data = s3.get_object(Bucket=bucket_name, Key=object_key)['Body'].read()
base64_image = base64.b64encode(image_data).decode('utf-8')

Prompt engineering is a critical factor in unlocking the full potential of generative AI applications like IDP. Crafting well-structured prompts makes sure that the AI system’s outputs are accurate, relevant, and aligned with your objectives, while mitigating potential risks.

With the Anthropic Claude 3 model integrated into the Amazon Bedrock IDP solution, you can use the model’s impressive visual understanding capabilities to effortlessly extract data from documents. Simply provide the image or document as input, and Anthropic Claude 3 will comprehend its contents, seamlessly extracting the desired information and presenting it in a human-readable format. All Anthropic Claude 3 models are capable of understanding non-English languages such as Spanish, Japanese, and French. In this particular use case, we demonstrate how to translate Spanish application forms into English by providing the appropriate prompt instructions.

However, LLMs like Anthropic Claude 3 can exhibit variability in their response formats. To achieve consistent and structured output, you can tailor your prompts to instruct the model to return the extracted data in a specific format, such as JSON with predefined keys. This approach enhances the interoperability of the model’s output with downstream applications and streamlines data processing workflows.
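Even with a strict prompt, it is prudent to parse the model's reply defensively, because the JSON can occasionally arrive wrapped in extra prose. The following is a minimal sketch of such a guard (the function name is ours, not part of the solution code):

```python
import json

def parse_model_json(model_text: str) -> dict:
    """Extract the first {...} span from the model's reply and parse it.

    Returns an empty dict if no valid JSON object is found, matching the
    prompt's instruction to return an empty object for non-form images.
    """
    start = model_text.find("{")
    end = model_text.rfind("}")
    if start == -1 or end <= start:
        return {}
    try:
        return json.loads(model_text[start:end + 1])
    except json.JSONDecodeError:
        return {}
```

A guard like this keeps a single malformed response from failing the whole queue consumer.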

The following is the prompt with the specific JSON output format:

prompt = """This image shows a birth certificate application form. Please precisely copy all the relevant information from the form.
Leave the field blank if there is no information in corresponding field.
If the image is not a birth certificate application form, simply return an empty JSON object.
If the application form is not filled, leave the fees attributes blank.
Translate any non-English text to English.
Organize and return the extracted data in a JSON format with the following keys:
{
    "applicantDetails":{
        "applicantName": "",
        "dayPhoneNumber": "",
        "address": "",
        "city": "",
        "state": "",
        "zipCode": "",
        "email":""
    },
    "mailingAddress":{
        "mailingAddressApplicantName": "",
        "mailingAddress": "",
        "mailingAddressCity": "",
        "mailingAddressState": "",
        "mailingAddressZipCode": ""
    },
    "relationToApplicant":[""],
    "purposeOfRequest": "",
    "BirthCertificateDetails":{
        "nameOnBirthCertificate": "",
        "dateOfBirth": "",
        "sex": "",
        "cityOfBirth": "",
        "countyOfBirth": "",
        "mothersMaidenName": "",
        "fathersName": "",
        "mothersPlaceOfBirth": "",
        "fathersPlaceOfBirth": "",
        "parentsMarriedAtBirth": "",
        "numberOfChildrenBornInSCToMother": "",
        "diffNameAtBirth":""
    },
    "fees":{
        "searchFee": "",
        "eachAdditionalCopy": "",
        "expediteFee": "",
        "totalFees": ""
    }
}"""

Invoke the Anthropic Claude 3 Sonnet model using the Amazon Bedrock API. Pass the prompt and the base64 image data as parameters:

def invoke_claude_3_multimodal(prompt, base64_image_data):
    request_body = {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 2048,
        "messages": [
            {
                "role": "user",
                "content": [
                    {
                        "type": "text",
                        "text": prompt,
                    },
                    {
                        "type": "image",
                        "source": {
                            "type": "base64",
                            # Set media_type to match the uploaded image type
                            # (for example, image/jpeg for the .jpeg files used in this post).
                            "media_type": "image/png",
                            "data": base64_image_data,
                        },
                    },
                ],
            }
        ],
    }
    try:
        response = bedrock.invoke_model(modelId=MODEL_ID, body=json.dumps(request_body))
        return json.loads(response['body'].read())
    except bedrock.exceptions.ClientError as err:
        print(f"Couldn't invoke Claude 3 Sonnet. Here's why: {err.response['Error']['Code']}: {err.response['Error']['Message']}")
        raise

Send the Amazon Bedrock API response to the SQS queue using the send_message method:

def send_message_to_sqs(message_body):
    try:
        sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(message_body))
    except sqs.exceptions.ClientError as e:
        print(f"Error sending message to SQS: {e.response['Error']['Code']}: {e.response['Error']['Message']}")

Next, modify the IAM role of the Lambda function to grant the required permissions:

    1. On the Lambda console, navigate to the function.
    2. On the Configuration tab, choose Permissions in the left pane.
    3. Choose the IAM role (for example, invoke_bedrock_claude3-role-{random chars}).

This will open the role on a new tab.

    1. In the Permissions policies section, choose Add permissions and Create inline policy.
    2. On the Create policy page, switch to the JSON tab in the policy editor.
    3. Enter the policy from the following code block, replacing {AWS Account ID} with your AWS account ID and {S3 Bucket Name} with your S3 bucket name.
    4. Choose Next.
    5. Enter a name for the policy (for example, invoke_bedrock_claude3-role-policy), and choose Create policy.
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "bedrock:InvokeModel",
            "Resource": "arn:aws:bedrock:us-east-1::foundation-model/*"
        },
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::{S3 Bucket Name}/*"
        },
        {
            "Effect": "Allow",
            "Action": "sqs:SendMessage",
            "Resource": "arn:aws:sqs:us-east-1:{AWS Account ID}:bedrock-idp-extracted-data"
        }
    ]
}

The policy will grant the following permissions:

- bedrock:InvokeModel on the Amazon Bedrock foundation models
- s3:GetObject on the objects in your S3 bucket
- sqs:SendMessage on the bedrock-idp-extracted-data queue

Additionally, modify the Lambda function’s timeout to 2 minutes. By default, it’s set to 3 seconds.
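If you script this change instead of using the console, a sketch like the following would work. The validation helper is ours; Lambda itself accepts timeouts from 1 second up to 15 minutes (900 seconds):

```python
def valid_timeout(seconds: int) -> bool:
    # Lambda permits timeouts between 1 second and 15 minutes (900 seconds).
    return 1 <= seconds <= 900

def set_timeout(function_name: str = "invoke_bedrock_claude3", seconds: int = 120) -> None:
    import boto3  # imported here so valid_timeout() works without boto3 installed
    if not valid_timeout(seconds):
        raise ValueError(f"Lambda timeout must be 1-900 seconds, got {seconds}")
    boto3.client("lambda", region_name="us-east-1").update_function_configuration(
        FunctionName=function_name, Timeout=seconds
    )
```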

Create an S3 Event Notification

To create an S3 Event Notification, complete the following steps:

    1. On the Amazon S3 console, open the bedrock-claude3-idp... S3 bucket.
    2. Navigate to Properties, and in the Event notifications section, create an event notification.
    3. Enter a name for Event name (for example, bedrock-claude3-idp-event-notification).
    4. Enter images/birth_certificates/ for the prefix.
    5. For Event Type, select Put in the Object creation section.
    6. For Destination, select Lambda function and choose invoke_bedrock_claude3.
    7. Choose Save changes.
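The same configuration can be applied with boto3's put_bucket_notification_configuration. The builder function below is our own sketch mirroring the console settings. One caveat: unlike the console, the API call does not automatically add the resource-based permission that lets S3 invoke the Lambda function, so you would also need a lambda add-permission call:

```python
def build_notification_config(lambda_arn: str,
                              prefix: str = "images/birth_certificates/") -> dict:
    # Mirrors the console settings: Put events under the given prefix invoke the function.
    return {
        "LambdaFunctionConfigurations": [{
            "Id": "bedrock-claude3-idp-event-notification",
            "LambdaFunctionArn": lambda_arn,
            "Events": ["s3:ObjectCreated:Put"],
            "Filter": {"Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}},
        }]
    }

def attach_notification(bucket_name: str, lambda_arn: str) -> None:
    import boto3  # imported here so the builder above works without boto3 installed
    boto3.client("s3", region_name="us-east-1").put_bucket_notification_configuration(
        Bucket=bucket_name,
        NotificationConfiguration=build_notification_config(lambda_arn),
    )
```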

Create a DynamoDB table

To store the extracted data in DynamoDB, you need to create a table. On the DynamoDB console, create a table called birth_certificates with Id as the partition key, and keep the remaining settings as default.
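Scripted, the table creation looks like the sketch below. The spec mirrors the console step above (Id as a string partition key); the on-demand billing mode is our assumption to avoid capacity planning, not something the post prescribes:

```python
TABLE_SPEC = {
    "TableName": "birth_certificates",
    # Partition key as described above; extracted values are stored as strings.
    "AttributeDefinitions": [{"AttributeName": "Id", "AttributeType": "S"}],
    "KeySchema": [{"AttributeName": "Id", "KeyType": "HASH"}],
    "BillingMode": "PAY_PER_REQUEST",  # assumption: on-demand capacity
}

def create_birth_certificates_table() -> None:
    import boto3  # imported here so TABLE_SPEC is inspectable without boto3 installed
    boto3.client("dynamodb", region_name="us-east-1").create_table(**TABLE_SPEC)
```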

Create a Lambda function to insert records into the DynamoDB table

On the Lambda console, create a Lambda function (for example, insert_into_dynamodb), choose Python 3.12 for the runtime, and leave the remaining settings as default. You can download the entire Lambda function code from insert_into_dynamodb.py. Replace the contents of the lambda_function.py file with the code from the downloaded file and choose Deploy.

The Lambda function should perform the following actions:

Get the message from the SQS queue that contains the response from the Anthropic Claude 3 Sonnet model:

data = json.loads(event['Records'][0]['body'])['content'][0]['text']
event_id = event['Records'][0]['messageId']
data = json.loads(data)

Create objects representing DynamoDB and its table:

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('birth_certificates')

Get the key objects from the JSON data:

applicant_details = data.get('applicantDetails', {})
mailing_address = data.get('mailingAddress', {})
relation_to_applicant = data.get('relationToApplicant', [])
birth_certificate_details = data.get('BirthCertificateDetails', {})
fees = data.get('fees', {})

Insert the extracted data into the DynamoDB table using the put_item() method:

table.put_item(Item={
    'Id': event_id,
    'applicantName': applicant_details.get('applicantName', ''),
    'dayPhoneNumber': applicant_details.get('dayPhoneNumber', ''),
    'address': applicant_details.get('address', ''),
    'city': applicant_details.get('city', ''),
    'state': applicant_details.get('state', ''),
    'zipCode': applicant_details.get('zipCode', ''),
    'email': applicant_details.get('email', ''),
    'mailingAddressApplicantName': mailing_address.get('mailingAddressApplicantName', ''),
    'mailingAddress': mailing_address.get('mailingAddress', ''),
    'mailingAddressCity': mailing_address.get('mailingAddressCity', ''),
    'mailingAddressState': mailing_address.get('mailingAddressState', ''),
    'mailingAddressZipCode': mailing_address.get('mailingAddressZipCode', ''),
    'relationToApplicant': ', '.join(relation_to_applicant),
    'purposeOfRequest': data.get('purposeOfRequest', ''),
    'nameOnBirthCertificate': birth_certificate_details.get('nameOnBirthCertificate', ''),
    'dateOfBirth': birth_certificate_details.get('dateOfBirth', ''),
    'sex': birth_certificate_details.get('sex', ''),
    'cityOfBirth': birth_certificate_details.get('cityOfBirth', ''),
    'countyOfBirth': birth_certificate_details.get('countyOfBirth', ''),
    'mothersMaidenName': birth_certificate_details.get('mothersMaidenName', ''),
    'fathersName': birth_certificate_details.get('fathersName', ''),
    'mothersPlaceOfBirth': birth_certificate_details.get('mothersPlaceOfBirth', ''),
    'fathersPlaceOfBirth': birth_certificate_details.get('fathersPlaceOfBirth', ''),
    'parentsMarriedAtBirth': birth_certificate_details.get('parentsMarriedAtBirth', ''),
    'numberOfChildrenBornInSCToMother': birth_certificate_details.get('numberOfChildrenBornInSCToMother', ''),
    'diffNameAtBirth': birth_certificate_details.get('diffNameAtBirth', ''),
    'searchFee': fees.get('searchFee', ''),
    'eachAdditionalCopy': fees.get('eachAdditionalCopy', ''),
    'expediteFee': fees.get('expediteFee', ''),
    'totalFees': fees.get('totalFees', '')
})

Next, modify the IAM role of the Lambda function to grant the required permissions. Follow the same steps you used to modify the permissions for the invoke_bedrock_claude3 Lambda function, but enter the following JSON as the inline policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "VisualEditor0",
            "Effect": "Allow",
            "Action": "dynamodb:PutItem",
            "Resource": "arn:aws:dynamodb:us-east-1:{AWS Account ID}:table/birth_certificates"
        },
        {
            "Sid": "VisualEditor1",
            "Effect": "Allow",
            "Action": [
                "sqs:DeleteMessage",
                "sqs:ReceiveMessage",
                "sqs:GetQueueAttributes"
            ],
            "Resource": "arn:aws:sqs:us-east-1:{AWS Account ID}:bedrock-idp-extracted-data"
        }
    ]
}

Enter a policy name (for example, insert_into_dynamodb-role-policy) and choose Create policy.

The policy will grant the following permissions:

- dynamodb:PutItem on the birth_certificates table
- sqs:ReceiveMessage, sqs:DeleteMessage, and sqs:GetQueueAttributes on the bedrock-idp-extracted-data queue

Configure the Lambda function trigger for SQS

Complete the following steps to create a trigger for the Lambda function:

    1. On the Amazon SQS console, open the bedrock-idp-extracted-data queue.
    2. On the Lambda triggers tab, choose Configure Lambda function trigger.
    3. Select the insert_into_dynamodb Lambda function and choose Save.
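The equivalent API call is create_event_source_mapping, sketched below. The validation helper is ours. We set BatchSize to 1 as an assumption, because the insert_into_dynamodb code shown later reads only event['Records'][0]; larger batches would silently drop messages with that parser:

```python
def looks_like_sqs_arn(arn: str) -> bool:
    # A well-formed SQS ARN has six colon-separated parts starting arn:aws:sqs.
    parts = arn.split(":")
    return len(parts) == 6 and parts[:3] == ["arn", "aws", "sqs"]

def attach_sqs_trigger(queue_arn: str,
                       function_name: str = "insert_into_dynamodb") -> None:
    import boto3  # imported here so looks_like_sqs_arn() works without boto3 installed
    if not looks_like_sqs_arn(queue_arn):
        raise ValueError(f"Not an SQS ARN: {queue_arn}")
    boto3.client("lambda", region_name="us-east-1").create_event_source_mapping(
        EventSourceArn=queue_arn,
        FunctionName=function_name,
        BatchSize=1,  # assumption: one message per invocation, matching the parser
    )
```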

Test the solution

Now that you have created all the necessary resources, permissions, and code, it’s time to test the solution.

In the S3 folder birth_certificates, upload the two scanned images that you downloaded earlier. Then open the DynamoDB console and explore the items in the birth_certificates table.

If everything is configured properly, you should see two items in DynamoDB in just a few seconds, as shown in the following screenshots. For the Spanish form, Anthropic Claude 3 automatically translated the keys and labels from Spanish to English based on the prompt.

Troubleshooting

If you don’t see the extracted data in the DynamoDB table, you can investigate the issue:

Clean up

Clean up the resources created as part of this post to avoid incurring ongoing charges:

    1. Delete all the objects from the bedrock-claude3-idp... S3 bucket, then delete the bucket.
    2. Delete the two Lambda functions named invoke_bedrock_claude3 and insert_into_dynamodb.
    3. Delete the SQS queue named bedrock-idp-extracted-data.
    4. Delete the DynamoDB table named birth_certificates.
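The cleanup steps above can be scripted with boto3 as well. This is a sketch under the resource names used in this post; the function name is ours, and the bucket must be emptied before it can be deleted:

```python
def cleanup(bucket_name: str) -> None:
    import boto3
    region = "us-east-1"
    # S3 refuses to delete a non-empty bucket, so empty it first.
    bucket = boto3.resource("s3", region_name=region).Bucket(bucket_name)
    bucket.objects.all().delete()
    bucket.delete()
    lam = boto3.client("lambda", region_name=region)
    for fn in ("invoke_bedrock_claude3", "insert_into_dynamodb"):
        lam.delete_function(FunctionName=fn)
    sqs = boto3.client("sqs", region_name=region)
    url = sqs.get_queue_url(QueueName="bedrock-idp-extracted-data")["QueueUrl"]
    sqs.delete_queue(QueueUrl=url)
    boto3.client("dynamodb", region_name=region).delete_table(TableName="birth_certificates")
```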

Example use cases and business value

The generative AI-powered IDP solution demonstrated in this post can benefit organizations across various industries, such as:

By using the power of generative AI and Amazon Bedrock, organizations can unlock the true potential of their data, driving operational excellence, enhancing customer experiences, and fostering continuous innovation.

Conclusion

In this post, we demonstrated how to use Amazon Bedrock and the powerful Anthropic Claude 3 Sonnet model to develop an IDP solution. By harnessing the advanced multimodal capabilities of Anthropic Claude 3, we were able to accurately extract data from scanned documents and store it in a structured format in a DynamoDB table.

Although this solution showcases the potential of generative AI in IDP, it may not be suitable for all IDP use cases. The effectiveness of the solution may vary depending on the complexity and quality of the documents, the amount of training data available, and the specific requirements of the organization.

To further enhance the solution, consider implementing a human-in-the-loop workflow to review and validate the extracted data, especially for mission-critical or sensitive applications. This helps ensure data accuracy and compliance with regulatory requirements. You can also explore the model evaluation feature in Amazon Bedrock to compare model outputs, and then choose the model best suited for your downstream generative AI applications.

For further exploration and learning, we recommend checking out the following resources:


About the Authors

Govind Palanisamy is a Solutions Architect at AWS, where he helps government agencies migrate and modernize their workloads to increase citizen experience. He is passionate about technology and transformation, and he helps customers transform their businesses using AI/ML and generative AI-based solutions.

Bharath Gunapati is a Sr. Solutions architect at AWS, where he helps clinicians, researchers, and staff at academic medical centers to adopt and use cloud technologies. He is passionate about technology and the impact it can make on healthcare and research.
