AWS Machine Learning Blog, the day before yesterday, 00:48
Structured outputs with Amazon Nova: A guide for builders

 

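Developers building AI applications often face the challenge of converting unstructured data into structured formats, and structured output is critical for machine-to-machine communication. This post describes how to address the problem with Amazon Nova foundation models (FMs) and constrained decoding. Through tool calling and precise schema definitions, Amazon Nova can handle complex schemas and reduces tool use errors by over 95%. Whether you are extracting information from documents, building assistants that fetch data, or developing agents that take actions, constrained decoding makes the output more reliable, and even smaller models can generate JSON for complex schemas.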

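💡 **Structured output is key for AI applications**: For machine-to-machine communication and downstream processing, converting unstructured data into structured formats is what lets applications consume and process model output efficiently. It matters most for information extraction, fetching data from APIs, and agents that take actions.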

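🛠️ **Constrained decoding improves output reliability**: Amazon Nova uses constrained decoding, which relies on a grammar to restrict the tokens the model can choose at each step. Unlike prompt engineering, it acts directly on generation: when closing a JSON object, for example, the model can select only the `}` token.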

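🔗 **Tool calling and schema definition**: By providing an API, code function, or schema (a tool configuration), you give the model a specific structure to follow. Tool calling is most often used to build agentic applications, but it also suits structured output use cases because it explicitly defines the schema the model should adhere to, such as a complex JSON structure for extracting recipe information.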

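📉 **Challenges with complex schemas, and the fix**: Model accuracy tends to drop as schemas grow more complex. Amazon Nova's constrained decoding generates a grammar dynamically from the schema definition and uses it to constrain the output, which prevents invalid keys and incorrect data types, so even smaller models can handle complex JSON schemas.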

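🚀 **Amazon Nova in practice**: With tool calling and constrained decoding, Amazon Nova delivers high reliability and handles complex schemas. Even a small model such as Amazon Nova Lite can output large, complex JSON, while Nova Pro or Nova Premier is recommended for image-based use cases with complex schemas. You can try this in your own applications today and start building in the Amazon Bedrock console.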

Developers building AI applications face a common challenge: converting unstructured data into structured formats. Structured output is critical for machine-to-machine communication because it lets downstream systems consume and process the generated outputs more effectively. Whether it’s extracting information from documents, creating assistants that fetch data from APIs, or developing agents that take actions, these tasks require foundation models to generate outputs in specific structured formats.

We launched constrained decoding to provide reliability when using tools for structured outputs. Now, tools can be used with Amazon Nova foundation models (FMs) to extract data based on complex schemas, reducing tool use errors by over 95%.

In this post, we explore how you can use Amazon Nova FMs for structured output use cases.

Techniques for implementing structured outputs

There are two common approaches to implementing structured output use cases: you can modify the system prompt, or you can take advantage of tool calling. For example, in a customer support use case, you might want the model to output a JSON object containing its response to the user and the current customer sentiment. With the first approach, you modify the system prompt to include the expected structure:

Make sure your final response is valid JSON that follows the below response schema:

##Response schema
```json
{
   "response": "the response to the customer",
   "sentiment": "the current customer sentiment"
}
```

The other option is to provide a tool configuration. Tool calling means passing the model an API, code function, or schema (or structure) required by your end application as part of the request to the Converse API. It is most often used when building agentic applications, but it is also frequently used in structured output use cases because it lets you define a set schema that the model should adhere to.

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "respondToUser",
                "description": "the formatted response to the customer",
                "inputSchema": {
                    # The Converse API expects the JSON schema under a "json" key
                    "json": {
                        "type": "object",
                        "properties": {
                            "response": {
                                "description": "the response to the customer",
                                "type": "string"
                            },
                            "sentiment": {
                                "description": "the current customer sentiment",
                                "type": "string"
                            }
                        },
                        "required": [
                            "response",
                            "sentiment"
                        ]
                    }
                }
            }
        }
    ]
}

Both approaches can be effective prompting techniques to influence the model output. However, the output is still non-deterministic and there is room for failure. In our work with customers to implement use cases such as agentic workflows and applications and structured extraction, we’ve observed that the accuracy of the model tends to decrease as the schema becomes more complex.
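Because prompt-guided output can still fail, an application that relies on the system prompt approach typically has to parse the reply defensively. The following is a minimal sketch of that idea, not code from the original post: the function name `parse_prompted_response` is hypothetical, and the checked keys simply mirror the customer support schema above.

```python
import json


def parse_prompted_response(raw_text):
    """Parse a prompt-guided JSON reply and check the expected keys.

    Returns the parsed dict, or raises ValueError describing the problem.
    """
    text = raw_text.strip()
    # Models sometimes wrap JSON in a Markdown fence; strip it if present.
    if text.startswith("```"):
        text = text.strip("`")
        if text.startswith("json"):
            text = text[len("json"):]
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(f"model output is not valid JSON: {err}") from err
    if not isinstance(data, dict):
        raise ValueError("model output is not a JSON object")
    # Keys taken from the customer support schema in this post.
    for key in ("response", "sentiment"):
        if not isinstance(data.get(key), str):
            raise ValueError(f"missing or non-string field: {key}")
    return data
```

A caller would usually catch `ValueError` and retry or fall back, which is exactly the failure handling that schema-enforced generation aims to make unnecessary.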

Structured output with Amazon Nova models

Based on these learnings, we have implemented constrained decoding in our system to help ensure high reliability in the generated output and to allow the model to handle complex schemas with ease. Constrained decoding relies on a grammar to constrain the possible tokens a model can output at each step. It differs from the prompting techniques historically used because it changes the actual set of tokens the model can choose from when generating an output. For example, when closing a JSON object, the model can select only a } token. Constrained decoding is used every time a tool configuration is passed. Because tool use already provides a specific schema, we can use it to generate a grammar dynamically, based on the schema desired by the developer. Constrained decoding prevents the model from generating invalid keys and enforces correct data types based on the defined schema.
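To make the token-masking idea concrete, here is a toy sketch. It is not how Amazon Nova implements constrained decoding: the five-state `GRAMMAR`, the seven-token `VOCAB`, and the `chatty_scores` stand-in for model logits are all hypothetical, chosen only to show that tokens outside the grammar are never candidates.

```python
import json

# Hypothetical vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["{", "}", ":", '"response"', '"sentiment"', '"text"', "Sure!"]

# Toy grammar for a one-key object: each state maps to
# (tokens allowed in that state, state reached after emitting one).
GRAMMAR = {
    "start": ({"{"}, "key"),
    "key": ({'"response"', '"sentiment"'}, "colon"),
    "colon": ({":"}, "value"),
    "value": ({'"text"'}, "close"),
    "close": ({"}"}, "done"),
}


def constrained_decode(score_fn):
    """Greedy decoding restricted to grammar-allowed tokens at each step."""
    state, output = "start", []
    while state != "done":
        allowed, next_state = GRAMMAR[state]
        # Mask: drop every token the grammar does not allow in this state,
        # no matter how highly the model scores it.
        candidates = [tok for tok in VOCAB if tok in allowed]
        output.append(max(candidates, key=lambda tok: score_fn(output, tok)))
        state = next_state
    return "".join(output)


def unconstrained_first_token(score_fn):
    """What plain greedy decoding would emit first, with no mask."""
    return max(VOCAB, key=lambda tok: score_fn([], tok))


def chatty_scores(prefix, token):
    """Stand-in for model logits: this 'model' most wants to say 'Sure!'."""
    return {"Sure!": 5.0, '"sentiment"': 2.0}.get(token, 1.0)


result = constrained_decode(chatty_scores)
parsed = json.loads(result)  # the constrained output always parses
```

Without the mask, the same scores would start the reply with `Sure!`; with the mask, the only possible outputs are objects the grammar describes.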

Schema definition process

A key step in using structured outputs with Amazon Nova is to create a tool configuration. The tool configuration provides a standard interface to define the expected output schema. While the primary intent of a tool configuration is to provide external functionality to the model, this JSON interface is used in structured output use cases as well. This can be illustrated using a use case that extracts recipes from online content. To start the integration, we create a tool configuration representing the specific fields we want extracted from the recipes. When creating a tool configuration, it is important to be clear and concise because the property names and descriptions are what inform the model how the fields should be populated.

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "extract_recipe",
                "description": "Extract recipe for cooking instructions",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "recipe": {
                                "type": "object",
                                "properties": {
                                    "name": {
                                        "type": "string",
                                        "description": "Name of the recipe"
                                    },
                                    "description": {
                                        "type": "string",
                                        "description": "Brief description of the dish"
                                    },
                                    "prep_time": {
                                        "type": "integer",
                                        "description": "Preparation time in minutes"
                                    },
                                    "cook_time": {
                                        "type": "integer",
                                        "description": "Cooking time in minutes"
                                    },
                                    "servings": {
                                        "type": "integer",
                                        "description": "Number of servings"
                                    },
                                    "difficulty": {
                                        "type": "string",
                                        "enum": [
                                            "easy",
                                            "medium",
                                            "hard"
                                        ],
                                        "description": "Difficulty level of the recipe"
                                    },
                                    "ingredients": {
                                        "type": "array",
                                        "items": {
                                            "type": "object",
                                            "properties": {
                                                "name": {
                                                    "type": "string",
                                                    "description": "Name of ingredient"
                                                },
                                                "amount": {
                                                    "type": "number",
                                                    "description": "Quantity of ingredient"
                                                },
                                                "unit": {
                                                    "type": "string",
                                                    "description": "Unit of measurement"
                                                }
                                            },
                                            "required": [
                                                "name",
                                                "amount",
                                                "unit"
                                            ]
                                        }
                                    },
                                    "instructions": {
                                        "type": "array",
                                        "items": {
                                            "type": "string",
                                            "description": "Step-by-step cooking instructions"
                                        }
                                    },
                                    "tags": {
                                        "type": "array",
                                        "items": {
                                            "type": "string",
                                            "description": "Categories or labels for the recipe"
                                        }
                                    }
                                },
                                "required": []
                            }
                        },
                        "required": []
                    }
                }
            }
        }
    ]
}

After the tool configuration has been created, we can pass it through the Converse API along with the recipe, which is contained in the user prompt. Historically, structured output use cases required a system prompt to guide the model in how to format its output. In this case, we can instead use it to pass details about the system role and persona.

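import boto3

# Assumes AWS credentials and a Region with Amazon Nova access are configured
client = boto3.client("bedrock-runtime")

# `content` is the list of content blocks holding the recipe text,
# for example [{"text": recipe_text}]
model_response = client.converse(
    modelId="us.amazon.nova-lite-v1:0",
    system=[{"text": "You are an expert recipe extractor that compiles recipe details from blog posts"}],
    messages=[{"role": "user", "content": content}],
    inferenceConfig={"temperature": 0},
    toolConfig=tool_config,
)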
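One optional refinement, not covered in the original post: the Converse API's tool configuration also accepts a `toolChoice` field, which can ask the model to call one specific tool rather than decide for itself. The helper below is a hypothetical sketch (the name `with_forced_tool` is an assumption), and you should confirm in the Amazon Nova User Guide that your chosen model supports `toolChoice` before relying on it.

```python
import copy


def with_forced_tool(tool_config, tool_name):
    """Return a copy of tool_config that asks the model to call tool_name."""
    names = [tool["toolSpec"]["name"] for tool in tool_config["tools"]]
    if tool_name not in names:
        raise ValueError(f"unknown tool: {tool_name}")
    forced = copy.deepcopy(tool_config)  # leave the caller's config untouched
    forced["toolChoice"] = {"tool": {"name": tool_name}}
    return forced
```

The result can be passed as `toolConfig=with_forced_tool(tool_config, "extract_recipe")`, which is useful when the application always expects a structured reply and never free text.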

By using the native tool use support with constrained decoding, we get a parsed tool call that will follow the correct syntax and expected schema as set in the tool configuration.

{
    "toolUse": {
        "toolUseId": "tooluse_HDCl-Y8gRa6yWTU-eE97xg",
        "name": "extract_recipe",
        "input": {
            "recipe": {
                "name": "Piacenza Tortelli",
                "description": "Piacenza tortelli, also known as 'tortelli with the tail' due to their elongated shape, are a delicious fresh pasta, easy to make at home!",
                "prep_time": 60,
                "cook_time": 10,
                "servings": 4,
                "difficulty": "hard",
                "ingredients": [
                    {
                        "name": "Type 00 flour",
                        "amount": 2.3,
                        "unit": "cups"
                    },
                    {
                        "name": "Eggs",
                        "amount": 3,
                        "unit": ""
                    },
                    {
                        "name": "Fine salt",
                        "amount": 1,
                        "unit": "pinch"
                    },
                    {
                        "name": "Spinach",
                        "amount": 13.3,
                        "unit": "cups"
                    },
                    {
                        "name": "Cow's milk ricotta cheese",
                        "amount": 1.3,
                        "unit": "cups"
                    },
                    {
                        "name": "Parmigiano Reggiano PDO cheese",
                        "amount": 4.2,
                        "unit": "oz"
                    },
                    {
                        "name": "Fine salt",
                        "amount": 1,
                        "unit": "to taste"
                    },
                    {
                        "name": "Nutmeg",
                        "amount": 1,
                        "unit": "to taste"
                    },
                    {
                        "name": "Butter",
                        "amount": 80,
                        "unit": "g"
                    },
                    {
                        "name": "Sage",
                        "amount": 2,
                        "unit": "sprigs"
                    }
                ],
                "instructions": [
                    "Arrange the flour in a mound and pour the eggs into the center 1; add a pinch of salt and start working with a fork 2, then knead by hand 3.",
                    "You should obtain a smooth dough 4; wrap it in plastic wrap and let it rest for half an hour in a cool place.",
                    "Meanwhile, prepare the filling starting with the spinach: immerse them in boiling salted water 5 and blanch them for a few minutes until wilted 6.",
                    "Drain the spinach and transfer them to cold water 7, preferably with ice. Then squeeze them very well 8 and chop them finely with a knife 9.",
                    "Place the chopped spinach in a bowl, add the ricotta 10, salt, pepper, and nutmeg 11. Also add the grated Parmigiano Reggiano DOP 12.",
                    "Mix well until you get a homogeneous consistency 13.",
                    "At this point, take the dough that has now rested 14, take a portion of it keeping the rest covered. Lightly flatten the dough with a rolling pin 15.",
                    "Roll it out with a pasta machine 16; as you reduce the thickness, fold the dough over itself 17 and roll it out again 18.",
                    "You should get a very thin rectangle, about 0.04-0.08 inches thick 19. Cut 2 strips of dough by dividing the rectangle in half lengthwise 20, then cut out diamonds of 4 inches 21.",
                    "Fill the diamonds with the spinach filling 22 and close them. To do this, bring one of the two longer points inward 23, then fold the two side points towards the center 24.",
                    "Now close the tortello by pinching the dough in the center and moving gradually towards the outside 25. The movement is similar to the closure of culurgiones. Continue in this way until the dough and filling are finished 26; you will get about 40-45 pieces.",
                    "Place a pot full of salted water on the stove. Meanwhile, in a pan, pour the butter and sage 27. Turn on the heat and let it flavor.",
                    "Then cook the tortelli for 5-6 minutes 28, then drain them and toss them in the butter and sage sauce 29.",
                    "Plate and serve the Piacenza tortelli with plenty of grated Parmigiano Reggiano DOP 30!"
                ],
                "tags": [
                    "vegetarian",
                    "Italian"
                ]
            }
        }
    }
}
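In the full Converse API response, a `toolUse` block like this one sits inside the assistant message's list of content blocks. As a minimal sketch of pulling the structured data out, the hypothetical helper below walks that list; `sample_response` is a trimmed, hand-written stand-in for a real response, not actual model output.

```python
def extract_tool_input(response, tool_name):
    """Return the `input` dict of the first toolUse block named tool_name."""
    blocks = response["output"]["message"]["content"]
    for block in blocks:
        tool_use = block.get("toolUse")
        if tool_use and tool_use["name"] == tool_name:
            return tool_use["input"]
    raise LookupError(f"no toolUse block for {tool_name!r} in response")


# Trimmed stand-in for the dict returned by client.converse(...)
sample_response = {
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {
                    "toolUse": {
                        "toolUseId": "example-id",
                        "name": "extract_recipe",
                        "input": {
                            "recipe": {"name": "Piacenza Tortelli", "servings": 4}
                        },
                    }
                }
            ],
        }
    },
    "stopReason": "tool_use",
}

recipe = extract_tool_input(sample_response, "extract_recipe")["recipe"]
```

With a real call, you would pass `model_response` instead of `sample_response`; the returned `input` is already a parsed Python dict, so no `json.loads` step is needed.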

Now, with constrained decoding, we can use a smaller model such as Amazon Nova Lite to output a large and complex JSON schema to use in our application. For image-based use cases with complex schemas, we recommend that you use Nova Pro or Nova Premier for the best performance.

Conclusion

By using structured output with Amazon Nova through tool calling, you can take advantage of the key benefits of constrained decoding and build a reliable system. We encourage you to try this out in your applications today. Learn more at the Amazon Nova User Guide. Get started building your AI applications with Amazon Nova in the Amazon Bedrock console.


About the authors

Jean Farmer is a Generative AI Solutions Architect on the Amazon Artificial General Intelligence (AGI) team, specializing in agentic applications. Based in Seattle, Washington, she works at the intersection of autonomous AI systems and practical business solutions, helping to shape the future of AGI at Amazon.

Mukund Birje is a Sr. Product Marketing Manager on the AIML team at AWS. In his current role he’s focused on driving adoption of Amazon Nova Foundation Models. He has over 10 years of experience in marketing and branding across a variety of industries. Outside of work you can find him hiking, reading, and trying out new restaurants. You can connect with him on LinkedIn.
