MarkTechPost@AI 2024年10月17日
Google AI Researchers Introduced a Set of New Methods for Enhancing Long-Context LLM Performance in Retrieval-Augmented Generation

Researchers from Google Cloud AI and the University of Illinois have jointly developed a set of new methods to improve the robustness and performance of retrieval-augmented generation (RAG) systems when used with long-context large language models (LLMs). The methods aim to mitigate the impact of "hard negatives": passages that appear relevant but actually introduce noise and lead the LLM to produce incorrect answers. The researchers propose a training-free method called retrieval reordering, which improves the arrangement of retrieved passages by placing the most relevant ones at the beginning and end of the input sequence, allowing the LLM to process the information more effectively. They also introduce training-based methods, namely implicit robustness fine-tuning and explicit relevance fine-tuning, which further strengthen the model's ability to handle irrelevant data. These methods deliver significant performance gains across a range of datasets, demonstrating their effectiveness in practical applications.

🤔 **Retrieval reordering:** To counter the "lost-in-the-middle" phenomenon seen in long-context LLMs, the researchers propose a training-free method called retrieval reordering. It rearranges the retrieved passages so that the most relevant ones sit at the beginning and end of the input sequence, helping the LLM attend to the most critical information and improving accuracy. For example, experiments on the Natural Questions dataset show that with retrieval reordering, the model's accuracy stays high even as the number of retrieved passages grows.

💪 **Explicit relevance fine-tuning:** The researchers also developed a training-based method called explicit relevance fine-tuning, which trains the LLM to actively analyze the retrieved documents and identify the most relevant passages before generating an answer. By strengthening the LLM's ability to distinguish important information from irrelevant content in complex multi-document settings, this method further improves accuracy.
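One way to picture explicit relevance fine-tuning is through the construction of its training examples: each target asks the model to cite the relevant passage indices before answering. This is a minimal sketch under assumed formatting; the function name and prompt template are hypothetical, and the paper's exact template may differ.

```python
def build_explicit_target(passages, relevant_ids, answer):
    """Compose a (prompt, target) pair for explicit relevance fine-tuning.

    The target trains the model to first cite which passages are relevant,
    then produce the answer (hypothetical format for illustration).
    """
    # Number each retrieved passage so the model can refer to it by index.
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages))
    prompt = (
        f"Passages:\n{context}\n"
        "Identify the relevant passages, then answer the question."
    )
    cited = ", ".join(f"[{i}]" for i in relevant_ids)
    target = f"Relevant passages: {cited}\nAnswer: {answer}"
    return prompt, target
```

A pair built this way, e.g. `build_explicit_target(["Paris is the capital of France.", "Lyon is a city."], [0], "Paris")`, yields a target that begins with the cited indices, so the fine-tuned model learns to perform the relevance analysis step before committing to an answer.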

🧠 **Implicit robustness fine-tuning:** To make the LLM more robust to noisy and potentially misleading information, the researchers introduce implicit robustness fine-tuning, which trains the model on data that contains noise so that it better resists the effects of noise in real-world applications.
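The data side of implicit robustness fine-tuning can be sketched as deliberately mixing hard negatives into each training context. This is an illustrative sketch, not the paper's pipeline; the function name, dictionary format, and sampling strategy are assumptions.

```python
import random

def build_noisy_example(question, gold_passages, hard_negatives, k, seed=0):
    """Assemble one fine-tuning example whose context mixes gold passages
    with k sampled hard negatives, so the model learns to answer despite
    noise (sketch; the real sampling strategy may differ).
    """
    rng = random.Random(seed)  # seeded for reproducible data construction
    negatives = rng.sample(hard_negatives, k)
    context = list(gold_passages) + negatives
    rng.shuffle(context)  # hide which passages are gold
    return {"question": question, "context": context}
```

Fine-tuning on examples like these never tells the model which passages are noise; it must implicitly learn to ignore them, which is what distinguishes this method from explicit relevance fine-tuning.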

📈 **Performance gains:** The study shows that these methods deliver significant performance improvements across datasets, including Natural Questions and PopQA. For example, applying retrieval reordering to large retrieval sets improved the accuracy of the Gemma-2-9B-Chat model by 5%, demonstrating the method's effectiveness in practice.

💡 **Key takeaway:** The research shows that retrieval reordering, explicit relevance fine-tuning, and implicit robustness fine-tuning can effectively improve the performance of long-context LLMs in retrieval-augmented generation systems, making them more resistant to noisy and misleading information and thereby more accurate and robust.

Large language models (LLMs) have revolutionized various fields by enabling more effective data processing, complex problem-solving, and natural language understanding. One major innovation is retrieval-augmented generation (RAG), which allows LLMs to retrieve relevant information from external sources, such as large knowledge databases, to generate better answers. However, the integration of long-context LLMs with RAG presents certain challenges. Specifically, while LLMs are becoming capable of handling longer input sequences, the increase in retrieved information can overwhelm the system. The challenge lies in making sure that the additional context improves the accuracy of the LLM’s outputs rather than confusing the model with irrelevant information. 

The problem faced by long-context LLMs stems from a phenomenon where increasing the number of retrieved passages does not necessarily improve performance. Instead, it often leads to performance degradation, primarily due to including irrelevant or misleading documents known as “hard negatives.” These hard negatives appear relevant based on certain retrieval criteria but introduce noise that misguides the LLM in generating the correct answer. As a result, the model’s accuracy declines despite having access to more information. This is particularly problematic for knowledge-intensive tasks where correctly identifying relevant information is crucial.

Existing RAG systems employ a retriever to select the most relevant passages from a database, which the LLM then processes. Standard RAG implementations, however, typically limit the number of retrieved passages to around ten. This works well for shorter contexts but does not scale efficiently as the number of passages increases. The issue becomes more pronounced when dealing with complex datasets containing multiple relevant passages. Current approaches fail to adequately address the risks of introducing misleading or irrelevant information, which can diminish the quality of LLM responses.

Researchers from Google Cloud AI and the University of Illinois introduced innovative methods to improve the robustness and performance of RAG systems when using long-context LLMs. Their approach encompasses training-free and training-based methods designed to mitigate the impact of hard negatives. One of the key innovations is retrieval reordering, a training-free method that improves the sequence in which the retrieved passages are fed to the LLM. The researchers propose prioritizing passages with higher relevance scores at the beginning and end of the input sequence, thus focusing the LLM’s attention on the most important information. In addition, training-based methods were introduced to further enhance the model’s ability to handle irrelevant data. These include implicit robustness fine-tuning and explicit relevance fine-tuning, both of which train the LLM to better discern relevant information and filter out misleading content.

Retrieval reordering is a relatively simple but effective approach that addresses the “lost-in-the-middle” phenomenon commonly observed in LLMs, where the model tends to focus more on the beginning and end of an input sequence while losing attention to the middle portions. By restructuring the input so that highly relevant information is placed at the edges of the sequence, the researchers improved the model’s ability to generate accurate responses. In addition, they explored implicit fine-tuning, which involves training the LLM with datasets containing noisy and potentially misleading information. This method encourages the model to become more resilient to such noise, making it more robust in practical applications. Explicit relevance fine-tuning goes one step further by teaching the LLM to actively analyze retrieved documents and identify the most relevant passages before generating an answer. This method enhances the LLM’s ability to distinguish between valuable and irrelevant information in complex, multi-document contexts.
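The reordering step described above can be sketched in a few lines: sort passages by retrieval score, then interleave them so that the strongest passages land at the two edges of the context and the weakest end up in the middle. This is a minimal illustration of the idea, assuming a simple alternating placement; the paper's exact procedure may differ.

```python
def reorder_passages(passages, scores):
    """Place the highest-scoring passages at the beginning and end of the
    context window, pushing weaker ones toward the middle, to counter the
    "lost-in-the-middle" effect (illustrative sketch).
    """
    # Rank passages from most to least relevant.
    ranked = [p for _, p in sorted(zip(scores, passages), key=lambda x: -x[0])]
    front, back = [], []
    # Alternate placement: rank 1 -> front edge, rank 2 -> back edge, etc.
    for i, passage in enumerate(ranked):
        (front if i % 2 == 0 else back).append(passage)
    return front + back[::-1]
```

For five passages ranked A > B > C > D > E, this yields the order A, C, E, D, B: the two most relevant passages (A and B) occupy the first and last positions, where long-context LLMs attend most reliably, while the weakest (E) sits in the middle.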

The proposed methods demonstrated notable improvements in accuracy and robustness. The research showed that retrieval reordering improved the LLM’s accuracy by several percentage points, particularly when handling large sets of retrieved passages. For example, experiments on the Natural Questions dataset showed that increasing the number of retrieved passages initially improved accuracy, but performance declined past a certain point as hard negatives became too prevalent. The introduction of reordering and fine-tuning mitigated this issue, maintaining higher accuracy even as the number of passages increased. Notably, accuracy with the Gemma-2-9B-Chat model improved by 5% when the reordering technique was applied to larger retrieval sets, demonstrating the technique’s effectiveness in real-world scenarios.


In conclusion, this research offers practical solutions to the challenges of long-context LLMs in RAG systems. By introducing innovative methods like retrieval reordering and fine-tuning approaches, the researchers have demonstrated a scalable way to enhance the accuracy and robustness of these systems, making them more reliable for handling complex, real-world data.




