Structured outputs with Amazon Nova: A guide for builders

Developers building AI applications face a common challenge: converting unstructured data into structured formats. Structured output is critical for machine-to-machine communication, because it lets downstream systems consume and process the generated output more effectively. Whether it’s extracting information from documents, creating assistants that fetch data from APIs, or developing agents that take actions, these tasks require foundation models to generate outputs in specific structured formats.

We launched constrained decoding to provide reliability when using tools for structured outputs. Now, tools can be used with Amazon Nova foundation models (FMs) to extract data based on complex schemas, reducing tool use errors by over 95%.

In this post, we explore how you can use Amazon Nova FMs for structured output use cases.

Techniques for implementing structured outputs

When addressing the requirements of structured output use cases, there are two common implementation approaches: you can modify the system prompt, or you can take advantage of tool calling. For example, in a customer support use case, you might want the model to output a JSON object containing its response to the user and the current customer sentiment. To do so, you would modify the system prompt to include the expected structure:

Make sure your final response is valid JSON that follows the below response schema:

##Response schema
```json
{
   "response": "the response to the customer",
   "sentiment": "the current customer sentiment"
}
```

The other option is to provide a tool configuration. Tool calling is the act of providing an API, code function, or schema (or structure) required by your end application to the model through the request schema with the Converse API. This is most often used when building agentic applications, but it is also frequently used in structured output use cases because it lets you define a set schema that the model should adhere to.

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "respondToUser",
                "description": "the formatted response to the customer",
                "inputSchema": {
                    # The Converse API expects the JSON schema under the "json" key
                    "json": {
                        "type": "object",
                        "properties": {
                            "response": {
                                "description": "the response to the customer",
                                "type": "string"
                            },
                            "sentiment": {
                                "description": "the current customer sentiment",
                                "type": "string"
                            }
                        },
                        "required": [
                            "response",
                            "sentiment"
                        ]
                    }
                }
            }
        }
    ]
}

Both approaches can be effective prompting techniques to influence the model output. However, the output is still non-deterministic and there is room for failure. In our work with customers to implement use cases such as agentic workflows and applications and structured extraction, we’ve observed that the accuracy of the model tends to decrease as the schema becomes more complex.
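Because neither prompting technique guarantees well-formed output on its own, applications often validate the response before using it. The following is a minimal sketch of such a check, assuming the two-field schema from the customer support example. The helper name `parse_support_response` and the sample strings are hypothetical and are not part of any AWS SDK.

```python
import json

# Expected fields and their Python types, mirroring the customer support schema
REQUIRED_FIELDS = {"response": str, "sentiment": str}


def parse_support_response(raw_text):
    """Parse model text as JSON and check it against the expected fields.

    Returns (data, errors). `data` is None when the text is not a JSON object.
    """
    try:
        data = json.loads(raw_text)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]

    if not isinstance(data, dict):
        return None, ["top-level value is not a JSON object"]

    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected_type):
            errors.append(f"wrong type for field: {field}")
    for field in data:
        if field not in REQUIRED_FIELDS:
            errors.append(f"unexpected field: {field}")
    return data, errors


# Hypothetical model outputs: one valid, one with a wrong key, one cut off
good = '{"response": "Happy to help!", "sentiment": "neutral"}'
bad = '{"response": "Happy to help!", "mood": "neutral"}'
truncated = '{"response": "Happy to help!", "sentim'

for sample in (good, bad, truncated):
    print(parse_support_response(sample)[1])
```

A check like this catches failures after the fact; it does not prevent them, which is the gap that the next section addresses.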

Structured output with Amazon Nova models

Based on these learnings, we implemented constrained decoding to help ensure high reliability in the generated output and to allow the model to handle complex schemas with ease. Constrained decoding relies on a grammar to constrain the possible tokens a model can output at each step. This differs from the prompting techniques historically used, because it changes the actual set of tokens the model can choose from when generating an output. For example, when closing a JSON object, the model can select only a } token. Constrained decoding is used every time a tool configuration is passed. Because tool use already provides a specific schema, we can use it to generate a grammar dynamically, based on the schema desired by the developer. Constrained decoding prevents the model from generating invalid keys and enforces correct data types based on the defined schema.
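To make the idea concrete, here is a toy illustration of grammar-based token masking. It is only a sketch under strong assumptions and is not how Amazon Nova implements constrained decoding: the vocabulary, the `fake_model_scores` function, and the fixed-key grammar in `allowed_tokens` are all invented for this example. A production system works over the model's real tokenizer and a grammar compiled from the full JSON schema.

```python
import json

# Toy vocabulary: each entry is treated as a single token
VOCAB = ["{", "}", "]", ":", ",", '"response"', '"sentiment"', '"mood"',
         '"Happy to help!"', '"neutral"']
VALUE_TOKENS = {'"Happy to help!"', '"neutral"'}

# The sequence the pretend model "wants" to emit: note the invalid key
# "mood" and the wrong closing bracket
PREFERRED = ["{", '"response"', ":", '"Happy to help!"', ",",
             '"mood"', ":", '"neutral"', "]"]


def allowed_tokens(prefix, keys):
    """Grammar for a flat object with the given string-valued keys, in order."""
    if not prefix:
        return {"{"}
    # After "{", tokens repeat in groups of four: key, colon, value, separator
    group, slot = divmod(len(prefix) - 1, 4)
    if group >= len(keys):
        return set()  # object already closed
    if slot == 0:
        return {f'"{keys[group]}"'}
    if slot == 1:
        return {":"}
    if slot == 2:
        return set(VALUE_TOKENS)
    return {","} if group < len(keys) - 1 else {"}"}


def fake_model_scores(prefix):
    """Stand-in for model logits: strongly prefers PREFERRED at each step."""
    step = len(prefix)
    return {tok: 1.0 if step < len(PREFERRED) and tok == PREFERRED[step] else 0.1
            for tok in VOCAB}


def decode(keys, constrained):
    """Greedy decoding, optionally masking tokens the grammar disallows."""
    prefix = []
    for _ in range(1 + 4 * len(keys)):
        scores = fake_model_scores(prefix)
        if constrained:
            allowed = allowed_tokens(prefix, keys)
            scores = {tok: s for tok, s in scores.items() if tok in allowed}
        prefix.append(max(scores, key=scores.get))
    return "".join(prefix)


constrained = decode(["response", "sentiment"], constrained=True)
unconstrained = decode(["response", "sentiment"], constrained=False)
print(json.loads(constrained))   # parses: the grammar forced valid keys and "}"
print(unconstrained)             # contains "mood" and ends with "]"
```

The point of the sketch is that masking happens before a token is chosen, so an invalid key or bracket can never be selected, regardless of how highly the model scores it.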

Schema definition process

A key step in using structured outputs with Amazon Nova is to create a tool configuration. The tool configuration provides a standard interface to define the expected output schema. While the primary intent of a tool configuration is to provide external functionality to the model, this JSON interface is used in structured output use cases as well. This can be illustrated using a use case that extracts recipes from online content. To start the integration, we create a tool configuration representing the specific fields we want extracted from the recipes. When creating a tool configuration, it is important to be clear and concise, because the property names and descriptions are what inform the model how the fields should be populated.
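Because descriptions carry so much of the guidance, it can help to check a schema for properties that lack one before sending a request. The following is a rough sketch of such a check. The helper `find_undescribed` and the small `sample_schema` are hypothetical, and the rule that an array counts as described when either it or its `items` has a description is a design assumption made for this example.

```python
def find_undescribed(schema, path=""):
    """Return the paths of properties that have no 'description'.

    Nested object properties are joined with '.', and array items are
    marked with '[]'.
    """
    missing = []
    for name, prop in schema.get("properties", {}).items():
        prop_path = f"{path}.{name}" if path else name
        items = prop.get("items", {})
        # Assumption: a description on the array or on its items is enough
        if "description" not in prop and "description" not in items:
            missing.append(prop_path)
        # Recurse into nested objects and into array item schemas
        missing.extend(find_undescribed(prop, prop_path))
        missing.extend(find_undescribed(items, prop_path + "[]"))
    return missing


# Hypothetical, deliberately incomplete schema used only to exercise the helper
sample_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string", "description": "Name of the recipe"},
        "servings": {"type": "integer"},
        "ingredients": {
            "type": "array",
            "description": "Ingredients used in the recipe",
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string", "description": "Name of ingredient"},
                    "unit": {"type": "string"},
                },
            },
        },
    },
}

print(find_undescribed(sample_schema))
```

For a Converse API tool configuration, the dictionary to pass would be the one stored under the `json` key of `inputSchema`.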

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "extract_recipe",
                "description": "Extract recipe for cooking instructions",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "recipe": {
                                "type": "object",
                                "properties": {
                                    "name": {
                                        "type": "string",
                                        "description": "Name of the recipe"
                                    },
                                    "description": {
                                        "type": "string",
                                        "description": "Brief description of the dish"
                                    },
                                    "prep_time": {
                                        "type": "integer",
                                        "description": "Preparation time in minutes"
                                    },
                                    "cook_time": {
                                        "type": "integer",
                                        "description": "Cooking time in minutes"
                                    },
                                    "servings": {
                                        "type": "integer",
                                        "description": "Number of servings"
                                    },
                                    "difficulty": {
                                        "type": "string",
                                        "enum": [
                                            "easy",
                                            "medium",
                                            "hard"
                                        ],
                                        "description": "Difficulty level of the recipe"
                                    },
                                    "ingredients": {
                                        "type": "array",
                                        "items": {
                                            "type": "object",
                                            "properties": {
                                                "name": {
                                                    "type": "string",
                                                    "description": "Name of ingredient"
                                                },
                                                "amount": {
                                                    "type": "number",
                                                    "description": "Quantity of ingredient"
                                                },
                                                "unit": {
                                                    "type": "string",
                                                    "description": "Unit of measurement"
                                                }
                                            },
                                            "required": [
                                                "name",
                                                "amount",
                                                "unit"
                                            ]
                                        }
                                    },
                                    "instructions": {
                                        "type": "array",
                                        "items": {
                                            "type": "string",
                                            "description": "Step-by-step cooking instructions"
                                        }
                                    },
                                    "tags": {
                                        "type": "array",
                                        "items": {
                                            "type": "string",
                                            "description": "Categories or labels for the recipe"
                                        }
                                    }
                                },
                                "required": []
                            }
                        },
                        "required": []
                    }
                }
            }
        }
    ]
}

After the tool configuration has been created, we can pass it through the Converse API along with the recipe, which will be contained in the user prompt. Historically, structured output use cases required a system prompt to guide the model on how to format its output. Here, because the tool configuration defines the format, we can instead use the system prompt to pass details about the system role and persona.

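import boto3

# Bedrock Runtime client; Region and credentials come from your AWS configuration
client = boto3.client("bedrock-runtime")

# `content` is a list of content blocks, for example [{"text": "<recipe text>"}]
model_response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    system=[{"text": "You are an expert recipe extractor that compiles recipe details from blog posts"}],
    messages=[{"role": "user", "content": content}],
    inferenceConfig={"temperature": 0},
    toolConfig=tool_config
)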

By using the native tool use support with constrained decoding, we get a parsed tool call that will follow the correct syntax and expected schema as set in the tool configuration.

{
    "toolUse": {
        "toolUseId": "tooluse_HDCl-Y8gRa6yWTU-eE97xg",
        "name": "extract_recipe",
        "input": {
            "recipe": {
                "name": "Piacenza Tortelli",
                "description": "Piacenza tortelli, also known as 'tortelli with the tail' due to their elongated shape, are a delicious fresh pasta, easy to make at home!",
                "prep_time": 60,
                "cook_time": 10,
                "servings": 4,
                "difficulty": "hard",
                "ingredients": [
                    {"name": "Type 00 flour", "amount": 2.3, "unit": "cups"},
                    {"name": "Eggs", "amount": 3, "unit": ""},
                    {"name": "Fine salt", "amount": 1, "unit": "pinch"},
                    {"name": "Spinach", "amount": 13.3, "unit": "cups"},
                    {"name": "Cow's milk ricotta cheese", "amount": 1.3, "unit": "cups"},
                    {"name": "Parmigiano Reggiano PDO cheese", "amount": 4.2, "unit": "oz"},
                    {"name": "Fine salt", "amount": 1, "unit": "to taste"},
                    {"name": "Nutmeg", "amount": 1, "unit": "to taste"},
                    {"name": "Butter", "amount": 80, "unit": "g"},
                    {"name": "Sage", "amount": 2, "unit": "sprigs"}
                ],
                "instructions": [
                    "Arrange the flour in a mound and pour the eggs into the center 1; add a pinch of salt and start working with a fork 2, then knead by hand 3.",
                    "You should obtain a smooth dough 4; wrap it in plastic wrap and let it rest for half an hour in a cool place.",
                    "Meanwhile, prepare the filling starting with the spinach: immerse them in boiling salted water 5 and blanch them for a few minutes until wilted 6.",
                    "Drain the spinach and transfer them to cold water 7, preferably with ice. Then squeeze them very well 8 and chop them finely with a knife 9.",
                    "Place the chopped spinach in a bowl, add the ricotta 10, salt, pepper, and nutmeg 11. Also add the grated Parmigiano Reggiano DOP 12.",
                    "Mix well until you get a homogeneous consistency 13.",
                    "At this point, take the dough that has now rested 14, take a portion of it keeping the rest covered. Lightly flatten the dough with a rolling pin 15.",
                    "Roll it out with a pasta machine 16; as you reduce the thickness, fold the dough over itself 17 and roll it out again 18.",
                    "You should get a very thin rectangle, about 0.04-0.08 inches thick 19. Cut 2 strips of dough by dividing the rectangle in half lengthwise 20, then cut out diamonds of 4 inches 21.",
                    "Fill the diamonds with the spinach filling 22 and close them. To do this, bring one of the two longer points inward 23, then fold the two side points towards the center 24.",
                    "Now close the tortello by pinching the dough in the center and moving gradually towards the outside 25. The movement is similar to the closure of culurgiones. Continue in this way until the dough and filling are finished 26; you will get about 40-45 pieces.",
                    "Place a pot full of salted water on the stove. Meanwhile, in a pan, pour the butter and sage 27. Turn on the heat and let it flavor.",
                    "Then cook the tortelli for 5-6 minutes 28, then drain them and toss them in the butter and sage sauce 29.",
                    "Plate and serve the Piacenza tortelli with plenty of grated Parmigiano Reggiano DOP 30!"
                ],
                "tags": [
                    "vegetarian",
                    "Italian"
                ]
            }
        }
    }
}
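In an application, the tool call arrives as one content block inside the Converse API response message, so it needs to be located before the extracted fields can be used. The following sketch shows one way to do that. The helper name `extract_tool_input` is hypothetical, and `sample_response` is a hand-written, abbreviated stand-in for a real response (including a made-up `toolUseId`), used so the example runs without calling the service.

```python
def extract_tool_input(response, tool_name):
    """Return the `input` dict of the first toolUse block for `tool_name`.

    Returns None when the response contains no matching tool call.
    """
    content_blocks = (
        response.get("output", {}).get("message", {}).get("content", [])
    )
    for block in content_blocks:
        tool_use = block.get("toolUse")
        if tool_use and tool_use.get("name") == tool_name:
            return tool_use.get("input")
    return None


# Abbreviated, hand-written response shaped like a Converse API result
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {
                    "toolUse": {
                        "toolUseId": "tooluse_example",  # made-up identifier
                        "name": "extract_recipe",
                        "input": {
                            "recipe": {"name": "Piacenza Tortelli", "servings": 4}
                        },
                    }
                }
            ],
        }
    },
    "stopReason": "tool_use",
}

recipe = extract_tool_input(sample_response, "extract_recipe")["recipe"]
print(recipe["name"], recipe["servings"])
```

Returning None for a missing tool call, rather than raising, leaves it to the caller to decide whether to retry or fall back.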

Now, with constrained decoding, we can use a smaller model such as Amazon Nova Lite to output a large and complex JSON schema to use in our application. For image-based use cases with complex schemas, we recommend that you use Nova Pro or Nova Premier for the best performance.

Conclusion

By using structured output with Amazon Nova through tool calling, you can take advantage of the key benefits of constrained decoding and build a reliable system. We encourage you to try this out in your applications today. Learn more at the Amazon Nova User Guide. Get started building your AI applications with Amazon Nova in the Amazon Bedrock console.

About the authors

Jean Farmer is a Generative AI Solutions Architect on the Amazon Artificial General Intelligence (AGI) team, specializing in agentic applications. Based in Seattle, Washington, she works at the intersection of autonomous AI systems and practical business solutions, helping to shape the future of AGI at Amazon.

Mukund Birje is a Sr. Product Marketing Manager on the AIML team at AWS. In his current role he’s focused on driving adoption of Amazon Nova Foundation Models. He has over 10 years of experience in marketing and branding across a variety of industries. Outside of work you can find him hiking, reading, and trying out new restaurants. You can connect with him on LinkedIn.

