AWS Machine Learning Blog 21小时前
Choosing the right approach for generative AI-powered structured data retrieval
index_new5.html
../../../zaker_core/zaker_tpl_static/wap/tpl_guoji1.html

 

本文探讨了利用大型语言模型(LLM)在AWS上查询结构化数据的不同模式,旨在帮助用户无需编写SQL或使用BI仪表板,就能直接获取业务问题的答案。文章分析了企业面临的挑战,并介绍了五种不同的解决方案,包括直接对话界面、增强BI工具、结合BI可视化与对话式AI、构建知识库等,以满足不同场景下的需求。

💡直接对话界面:使用Amazon Q Business,企业员工可以通过自然语言提问,直接从结构化和非结构化数据源中获取答案。该模式简化了连接,提供了统一的搜索体验,并内置了身份、角色和权限的识别功能。

📊增强BI工具:通过Amazon Q in QuickSight,用户可以在QuickSight界面内使用自然语言查询,获得可视化的答案,无需编写SQL。这种模式适用于内部BI和分析用例,方便业务分析师和高管快速获取数据洞察。

🤝结合BI可视化与对话式AI:整合Amazon Q in QuickSight和Amazon Q Business,创建一个统一的对话界面,可以同时从结构化数据和非结构化数据中获取信息。这种模式支持在同一对话流程中,同时查询数据库数据和访问文档知识。

📚构建知识库:利用Amazon Bedrock Knowledge Bases,实现结构化数据检索。该模式提供完全托管的文本到SQL功能,无需模型训练,可以直接查询数据源,并支持复杂的分析查询。

⚙️五种模式的总结:文章总结了以上五种模式,包括架构、适用场景、优势、注意事项和实施资源,为用户提供了选择最佳解决方案的框架。

Organizations want direct answers to their business questions without the complexity of writing SQL queries or navigating through business intelligence (BI) dashboards to extract data from structured data stores. Examples of structured data include tables, databases, and data warehouses that conform to a predefined schema. Large language model (LLM)-powered natural language query systems transform how we interact with data, so you can ask questions like “Which region has the highest revenue?” and receive immediate, insightful responses. Implementing these capabilities requires careful consideration of your specific needs—whether you need to integrate knowledge from other systems (for example, unstructured sources like documents), serve internal or external users, handle the analytical complexity of questions, or customize responses for business appropriateness, among other factors.

In this post, we discuss LLM-powered structured data query patterns in AWS. We provide a decision framework to help you select the best pattern for your specific use case.

Business challenge: Making structured data accessible

Organizations have vast amounts of structured data but struggle to make it effectively accessible to non-technical users for several reasons:

Solution overview

An effective solution should provide the following:

In the following sections, we explore five patterns that can address these needs, highlighting the architecture, ideal use cases, benefits, considerations, and implementation resources for each approach.

Pattern 1: Direct conversational interface using an enterprise assistant

This pattern uses Amazon Q Business, a generative AI-powered assistant, to provide a chat interface on data sources with native connectors. When users ask questions in natural language, Amazon Q Business connects to the data source, interprets the question, and retrieves relevant information without requiring intermediate services. The following diagram illustrates this workflow.

This approach is ideal for internal enterprise assistants that need to answer business user-facing questions from both structured and unstructured data sources in a unified experience. For example, HR personnel can ask “What’s our parental leave policy and how many employees used it last quarter?” and receive answers drawn from both leave policy documentation and employee databases together in one interaction. With this pattern, you can benefit from the following:

You can define the scope of data to be pulled in the form of a SQL query. Amazon Q Business pre-indexes database content based on defined SQL queries and uses this index when responding to user questions. Similarly, you can define the sync mode and schedule to determine how often you want to update your index. Amazon Q Business does the heavy lifting of indexing the data using a Retrieval Augmented Generation (RAG) approach and using an LLM to generate well-written answers. For more details on how to set up Amazon Q Business with an Amazon Aurora PostgreSQL-Compatible Edition connector, see Discover insights from your Amazon Aurora PostgreSQL database using the Amazon Q Business connector. You can also refer to the complete list of supported data source connectors.

Pattern 2: Enhancing BI tool with natural language querying capabilities

This pattern uses Amazon Q in QuickSight to process natural language queries against datasets that have been previously configured in Amazon QuickSight. Users can ask questions in everyday language within the QuickSight interface and get visualized answers without writing SQL. This approach works with QuickSight (Enterprise or Q edition) and supports various data sources, including Amazon Relational Database Service (Amazon RDS), Amazon Redshift, Amazon Athena, and others. The architecture is depicted in the following diagram.

This pattern is well-suited for internal BI and analytics use cases. Business analysts, executives, and other employees can ask ad-hoc questions to get immediate visualized insights in the form of dashboards. For example, executives can ask questions like “What were our top 5 regions by revenue last quarter?” and immediately see responsive charts, reducing dependency on analytics teams. The benefits of this pattern are as follows:

Existing QuickSight users might find this the most straightforward way to take advantage of generative AI benefits. You can optimize this pattern for higher-quality results by configuring topics like curated fields, synonyms, and expected question phrasing. This pattern will pull data only from a specific configured data source in QuickSight to produce a dashboard as an output. For more details, check out QuickSight DemoCentral to view a demo in QuickSight, see the generative BI learning dashboard, and view guided instructions to create dashboards with Amazon Q. Also refer to the list of supported data sources.

Pattern 3: Combining BI visualization with conversational AI for a seamless experience

This pattern merges BI visualization capabilities with conversational AI to create a seamless knowledge experience. By integrating Amazon Q in QuickSight with Amazon Q Business (with the QuickSight plugin enabled), organizations can provide users with a unified conversational interface that draws on both unstructured and structured data. The following diagram illustrates the architecture.

This is ideal for enterprises that want an internal AI assistant to answer a variety of questions—whether it’s a metric from a database or knowledge from a document. For example, executives can ask “What was our Q4 revenue growth?” and see visualized results from data warehouses through Amazon Redshift through QuickSight, then immediately follow up with “What is our company vacation policy?” to access HR documentation—all within the same conversation flow. This pattern offers the following benefits:

For more details, see Query structured data from Amazon Q Business using Amazon QuickSight integration and Amazon Q Business now provides insights from your databases and data warehouses (preview).

Another variation of this pattern is recommended for BI users who want to expose unified data through rich visuals in QuickSight, as illustrated in the following diagram.

For more details, see Integrate unstructured data into Amazon Quicksight using Amazon Q Business.

Pattern 4: Building knowledge bases from structured data using managed text-to-SQL

This pattern uses Amazon Bedrock Knowledge Bases to enable structured data retrieval. The service provides a fully managed text-to-SQL module that alleviates common challenges in developing natural language query applications for structured data. This implementation uses Amazon Bedrock (Amazon Bedrock Agents and Amazon Bedrock Knowledge Bases) along with your choice of data warehouse such as Amazon Redshift or Amazon SageMaker Lakehouse. The following diagram illustrates the workflow.

For example, a seller can use this capability embedded into an ecommerce application to ask a complex query like “Give me top 5 products whose sales increased by 50% last year as compared to previous year? Also group the results by product category.” The system automatically generates the appropriate SQL, executes it against the data sources, and delivers results or a summarized narrative. This pattern features the following benefits:

Choose this pattern when you need a flexible, developer-oriented solution. This approach works well for applications (internal or external) where you control the UI design. Default outputs are primarily text or structured data. However, executing arbitrary SQL queries can be a security risk for text-to-SQL applications. It is recommended that you take precautions as needed, such as using restricted roles, read-only databases, and sandboxing. For more information on how to build this pattern, see Empower financial analytics by creating structured knowledge bases using Amazon Bedrock and Amazon Redshift. For a list of supported structured data stores, refer to Create a knowledge base by connecting to a structured data store.

Pattern 5: Custom text-to-SQL implementation with flexible model selection

This pattern represents a build-your-own solution using FMs to convert natural language to SQL, execute queries on data warehouses, and return results. Choose Amazon Bedrock when you want to quickly integrate this capability without deep ML expertise—it offers a fully managed service with ready-to-use FMs through a unified API, handling infrastructure needs with pay-as-you-go pricing. Alternatively, select Amazon SageMaker AI when you require extensive model customization to build specialized needs—it provides complete ML lifecycle tools for data scientists and ML engineers to build, train, and deploy custom models with greater control. For more information, refer to our Amazon Bedrock or Amazon SageMaker AI decision guide. The following diagram illustrates the architecture.

Use this pattern if your use case requires specific open-weight models, or you want to fine-tune models on your domain-specific data. For example, if you need highly accurate results for your query, then you can use this pattern to fine-tune models on specific schema structures, while maintaining the flexibility to integrate with existing workflows and multi-cloud environments. This pattern offers the following benefits:

For more information on steps to build this pattern, see Build a robust text-to-SQL solution generating complex queries, self-correcting, and querying diverse data sources.

Pattern comparison: Making the right choice

To make effective decisions, let’s compare these patterns across key criteria.

Data workload suitability

Different out-of-the-box patterns handle transactional (operational) and analytical (historical or aggregated) data with varying degrees of effectiveness. Patterns 1 and 3, which use Amazon Q Business, work with indexed data and are optimized for lookup-style queries against previously indexed content rather than real-time transactional database queries. Pattern 2, which uses Amazon Q in QuickSight, gets visual output for transactional information for ad-hoc analysis. Pattern 4, which uses Amazon Bedrock structured data retrieval, is specifically designed for analytical systems and data warehouses, excelling at complex queries on large datasets. Pattern 5 is a self-managed text-to-SQL option that can be built to support both transactional or analytical needs of users.

Target audience

Architectures highlighted in Patterns 1, 2, and 3 (using Amazon Q Business, Amazon Q in QuickSight, or a combination) are best suited for internal enterprise use. However, you can use Amazon QuickSight Embedded to embed data visuals, dashboards, and natural language queries into both internal or customer-facing applications. Amazon Q Business serves as an enterprise AI assistant for organizational knowledge that uses subscription-based pricing tiers that is designed for internal employees. Pattern 4 (using Amazon Bedrock) can be used to build both internal as well as customer-facing applications. This is because, unlike the subscription-based model of Amazon Q Business, Amazon Bedrock provides API-driven services that alleviate per-user costs and identity management overhead for external customer scenarios. This makes it well-suited for customer-facing experiences where you need to serve potentially thousands of external users. The custom LLM solutions in Pattern 5 can similarly be tailored to external application requirements.

Interface and output format

Different patterns deliver answers through different interaction models:

The following figure is a comparison matrix of AWS structured data query patterns.

Conclusion

Between these patterns, your optimal choice depends on the following key factors:

By understanding these patterns and their trade-offs, you can architect solutions that align with your business objectives.


About the authors

Akshara Shah is a Senior Solutions Architect at Amazon Web Services. She helps commercial customers build cloud-based generative AI services to meet their business needs. She has been designing, developing, and implementing solutions that leverage AI and ML technologies for more than 10 years. Outside of work, she loves painting, exercising and spending time with family.

Sanghwa Na is a Generative AI Specialist Solutions Architect at Amazon Web Services. Based in San Francisco, he works with customers to design and build generative AI solutions using large language models and foundation models on AWS. He focuses on helping organizations adopt AI technologies that drive real business value

Fish AI Reader

Fish AI Reader

AI辅助创作,多种专业模板,深度分析,高质量内容生成。从观点提取到深度思考,FishAI为您提供全方位的创作支持。新版本引入自定义参数,让您的创作更加个性化和精准。

FishAI

FishAI

鱼阅,AI 时代的下一个智能信息助手,助你摆脱信息焦虑

联系邮箱 441953276@qq.com

相关标签

LLM 结构化数据 AWS 自然语言查询
相关文章