MarkTechPost@AI, October 12, 2024
Google AI Researchers Propose Astute RAG: A Novel RAG Approach to Deal with the Imperfect Retrieval Augmentation and Knowledge Conflicts of LLMs

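Astute RAG is an approach proposed by researchers at Google Cloud and the University of Southern California to address knowledge conflicts in retrieval-augmented generation. Through an improved consolidation mechanism, it manages and mitigates potential conflicts and improves LLM performance on knowledge-intensive tasks.

🎯 Astute RAG introduces an adaptive framework that dynamically adjusts how internal and external knowledge is used. It first elicits information from the LLM's internal knowledge, then performs source-aware consolidation with the retrieved content, identifying and resolving knowledge conflicts by iteratively refining the information sources.

📈 Experiments show that Astute RAG performs well on datasets including TriviaQA, BioASQ, and PopQA. After three consolidation iterations, it reaches 84.45% accuracy on TriviaQA and 62.24% on BioASQ, surpassing the best baseline RAG method.

💪 Astute RAG maintains high performance in the worst case: even when all external data is misleading, it remains robust and can handle extreme cases of knowledge conflict.

🔍 The study identifies imperfect retrieval as a major cause of failure in existing RAG systems. Over multiple iterations, Astute RAG filters out irrelevant or harmful data so that the LLM generates reliable, accurate responses.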

Retrieval-augmented generation (RAG) has become a key technique for enhancing the capabilities of LLMs by incorporating external knowledge into their outputs. RAG methods let LLMs access additional information from external sources, such as web-based databases, scientific literature, or domain-specific corpora, which improves their performance on knowledge-intensive tasks. By combining internal model knowledge with retrieved external data, RAG systems can generate more contextually accurate responses. Despite these advantages, RAG systems often struggle to consolidate the retrieved information with internal knowledge, leading to potential conflicts and decreased reliability in model outputs.

A major challenge associated with RAG is imperfect retrieval: whenever a RAG system retrieves external data, it risks pulling in irrelevant, outdated, or malicious information. This can lead to inconsistencies and incorrect outputs when the LLM attempts to merge its internal knowledge with flawed external content. For example, studies have shown that up to 70% of retrieved passages in real-world scenarios do not directly contain true answers, resulting in degraded performance of LLMs with RAG augmentation. The problem is exacerbated when LLMs face complex queries or domains where the reliability of external sources is uncertain. To tackle this, the researchers focused on creating a system that can effectively manage and mitigate these conflicts through improved consolidation mechanisms.
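To make the "passages that do not directly contain true answers" figure concrete, here is a minimal sketch of how such a rate could be estimated. It assumes a crude case-insensitive substring match against a list of accepted answers; the function names `contains_answer` and `retrieval_precision` are illustrative and do not come from the paper, which may measure this differently.

```python
from typing import List


def contains_answer(passage: str, answers: List[str]) -> bool:
    """Return True if any accepted answer appears in the passage (case-insensitive)."""
    text = passage.lower()
    return any(answer.lower() in text for answer in answers)


def retrieval_precision(passages: List[str], answers: List[str]) -> float:
    """Fraction of retrieved passages that directly contain an accepted answer."""
    if not passages:
        return 0.0
    hits = sum(contains_answer(p, answers) for p in passages)
    return hits / len(passages)


# Toy data, invented for illustration only.
passages = [
    "Paris is the capital of France.",
    "Lyon is a large city in France.",
    "France borders Spain and Italy.",
]
precision = retrieval_precision(passages, ["Paris"])
print(f"{precision:.2f} of passages contain the answer")
```

In this toy example one of three passages mentions the answer, so the precision is about 0.33 and the remaining two thirds of the retrieved context is at best unhelpful. Substring matching is only a proxy: it misses paraphrased answers and can count passages that mention the answer string in a misleading context.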

Traditional approaches to RAG have included various strategies to enhance retrieval quality and robustness, such as filtering irrelevant data, using multi-agent systems to critique retrieved passages, or employing query rewriting techniques. While these methods have shown some effectiveness in improving initial retrieval, they cannot handle the inherent conflicts between internal and external information in the post-retrieval stage. As a result, they fall short when the retrieved data is of poor or inconsistent quality, leading to incorrect responses. The research team sought to address this gap by developing a method that not only filters and selects high-quality data but also consolidates conflicting knowledge sources to ensure the final output’s reliability.

Researchers from Google Cloud AI Research and the University of Southern California developed Astute RAG, which introduces a unique approach to tackle the imperfections of retrieval augmentation. The researchers implemented an adaptive framework that dynamically adjusts how internal and external knowledge is utilized. Astute RAG initially elicits information from LLMs’ internal knowledge, which is a complementary source to external data. It then performs source-aware consolidation by comparing internal knowledge with retrieved passages. This process identifies and resolves knowledge conflicts through an iterative refinement of information sources. The final response is determined based on the reliability of consistent data, ensuring that the output is not influenced by incorrect or misleading information.
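The three stages described above (eliciting internal knowledge, iterative source-aware consolidation, and answer finalization) can be sketched as a simple control flow. This is an illustrative outline under stated assumptions, not the authors' implementation: `llm` is a hypothetical callable that maps a prompt string to generated text, and the prompt wording, the `tag_passages` helper, and the `astute_rag_sketch` name are all invented for this example.

```python
from typing import Callable, List

# Hypothetical LLM interface: takes a prompt string, returns generated text.
LLM = Callable[[str], str]


def tag_passages(internal: str, retrieved: List[str]) -> str:
    """Label each passage with its origin so consolidation can be source-aware."""
    lines = [f"[internal] {internal}"]
    lines += [f"[external {i + 1}] {text}" for i, text in enumerate(retrieved)]
    return "\n".join(lines)


def astute_rag_sketch(llm: LLM, question: str, retrieved: List[str],
                      iterations: int = 3) -> str:
    # Step 1: elicit what the model already knows, before it sees retrieved text.
    internal = llm(
        "From your own knowledge only, write a short passage that helps "
        f"answer the question.\nQuestion: {question}"
    )

    # Step 2: iteratively consolidate internal and external passages.
    # Prompt wording is an assumption made for this sketch.
    context = tag_passages(internal, retrieved)
    for _ in range(iterations):
        context = llm(
            "Group the passages below by the answer they support, keep the "
            "source tags, flag conflicts, and drop irrelevant content.\n"
            f"Question: {question}\nPassages:\n{context}"
        )

    # Step 3: answer from the most reliable, consistent group of passages.
    return llm(
        "Using the consolidated passages, give the answer supported by the "
        f"most reliable, consistent information.\nQuestion: {question}\n"
        f"Consolidated:\n{context}"
    )
```

With `iterations=3`, matching the three consolidation iterations reported in the summary, this sketch makes five model calls per question: one to elicit internal knowledge, three to consolidate, and one to finalize. Keeping the source tags through each round is what lets the final step weigh the model's own passage against the retrieved ones rather than treating all context as equally trustworthy.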

The experimental results demonstrated the effectiveness of Astute RAG across diverse datasets such as TriviaQA, BioASQ, and PopQA. On average, the new approach achieved a 6.85% improvement in overall accuracy compared to traditional RAG systems. When the researchers tested Astute RAG under the worst-case scenario, where all retrieved passages were unhelpful or misleading, the method still outperformed other systems by a considerable margin. For instance, while other RAG methods failed to produce accurate outputs in such conditions, Astute RAG reached performance levels close to those of using only internal model knowledge. This result indicates that Astute RAG effectively overcomes the inherent limitations of existing retrieval-based approaches.

The research’s key takeaways can be summarized as follows:

In conclusion, Astute RAG addresses the critical challenge of knowledge conflicts in retrieval-augmented generation by introducing an adaptive framework that effectively consolidates internal and external information. This approach mitigates the negative effects of imperfect retrieval and enhances the robustness and reliability of LLM responses in real-world applications. The experimental results indicate that Astute RAG is a solution for tackling the limitations of existing RAG systems, particularly in challenging scenarios with unreliable external sources.



The post Google AI Researchers Propose Astute RAG: A Novel RAG Approach to Deal with the Imperfect Retrieval Augmentation and Knowledge Conflicts of LLMs appeared first on MarkTechPost.
