MarkTechPost@AI · January 29
Microsoft AI Introduces CoRAG (Chain-of-Retrieval Augmented Generation): An AI Framework for Iterative Retrieval and Reasoning in Knowledge-Intensive Tasks

CoRAG is a new retrieval-augmented generation framework that improves the performance of large language models on knowledge-intensive tasks through iterative retrieval and reasoning. Unlike traditional RAG methods, CoRAG dynamically reformulates queries before generating an answer and uses rejection sampling to automatically generate intermediate retrieval chains. The framework achieves state-of-the-art results on multi-hop QA and the KILT benchmark, performing especially well on complex reasoning tasks, and it balances performance against compute through adaptive decoding strategies. CoRAG points to a new direction for building more trustworthy AI systems.

🔄 The core of CoRAG is iterative retrieval and reasoning: unlike the single retrieval step of traditional RAG, it dynamically adjusts queries to match the evolving reasoning state.

🧮 CoRAG uses rejection sampling to automatically generate intermediate retrieval chains, removing the need for manual annotation, and trains the model in a multi-task learning framework to predict sub-queries, sub-answers, and final answers.

🚀 At test time, CoRAG supports multiple decoding strategies, such as greedy decoding, best-of-N sampling, and tree search, to tune the trade-off between performance and computational efficiency.

🏆 CoRAG achieves state-of-the-art performance on multi-hop QA datasets (such as 2WikiMultihopQA and HotpotQA) and on the KILT benchmark, with especially strong results on complex reasoning tasks.

Retrieval-Augmented Generation (RAG) is a key technique in enterprise applications that combines large foundation models with external retrieval systems to generate responses that are both accurate and grounded in factual information. Unlike traditional foundation models, which are trained on massive datasets and remain static post-deployment, RAG enhances reliability by incorporating real-time or domain-specific information during the generation process. This integration addresses common issues like hallucinations or gaps in long-tail factual knowledge. RAG systems typically follow a sequential pipeline where retrieved information is provided as input to the generative model, with the overall performance depending heavily on the quality of the retrieval process. To ensure scalability, dense retrievers often use bi-encoder architectures for compressing documents and queries into fixed-size vectors, enabling efficient search algorithms. However, this efficiency comes at the cost of reduced flexibility for handling complex or multi-hop queries, which require iterative reasoning and retrieval steps based on dynamically evolving information.
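To make the bi-encoder pattern concrete, here is a minimal sketch in Python. The `encode` function is a stand-in for a real embedding model (the random vectors below are placeholders, so the scores carry no meaning); only the shape of the pipeline, embed documents offline and then rank by dot product at query time, reflects the description above.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(texts: list[str]) -> np.ndarray:
    """Placeholder for a real sentence encoder: one fixed-size vector per text.
    Random vectors here -- swap in an actual embedding model in practice."""
    return rng.standard_normal((len(texts), 384))

# Offline step: embed and normalize the document collection once.
corpus = ["Doc about RAG pipelines.", "Doc about dense retrieval.", "Unrelated doc."]
doc_vecs = encode(corpus)
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Online step: embed the query and rank documents by cosine similarity."""
    q = encode([query])[0]
    q /= np.linalg.norm(q)
    scores = doc_vecs @ q  # dot product of unit vectors = cosine similarity
    return [corpus[i] for i in np.argsort(-scores)[:k]]
```

The fixed-size vectors are what make search scale, and they are also the source of the inflexibility the paragraph notes: a single static query embedding cannot adapt as a multi-hop question unfolds.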

Recent advancements in RAG have introduced iterative retrieval-generation methods to overcome the limitations of a single retrieval step. Approaches like FLARE and ITER-RETGEN enable models to decide when and what to retrieve during generation, enhancing performance in complex reasoning tasks. Methods like IRCoT adopt chain-of-thought reasoning, refining retrieval steps recursively, while Self-RAG integrates retrieval, generation, and critique for improved factual accuracy. Scaling test-time computing has also been explored to boost RAG performance, with strategies such as retrieving more documents or using long-context LLMs, as seen in LongRAG and IterDRAG. Tree-of-Thought (ToT) and STaR extend reasoning capabilities by leveraging structured exploration and intermediate training states, though these approaches increase token consumption and response latency. Newer methods, like Search-o1, integrate open-source models with active search mechanisms, further advancing RAG’s potential in knowledge-intensive tasks.

Researchers from Microsoft Corporation and the Renmin University of China introduced CoRAG (Chain-of-Retrieval Augmented Generation), a method for training RAG models to iteratively retrieve and reason before generating answers. Unlike conventional RAG systems, CoRAG dynamically reformulates queries based on the evolving reasoning state. The approach uses rejection sampling to augment datasets with intermediate retrieval chains, enabling fine-tuning of open-source models. CoRAG achieves state-of-the-art results on benchmarks like KILT, particularly excelling in multi-hop reasoning tasks by addressing retrieval bottlenecks. It supports diverse decoding strategies, adjusts test-time retrieval dynamically, and demonstrates robustness to varying retriever quality, offering a pathway to more grounded and factual AI models.
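As a rough illustration of that loop, the sketch below shows one way chain-of-retrieval inference could be wired up. The `llm` and `retrieve` callables, the prompt formats, and the stopping convention are all hypothetical stand-ins for illustration, not the paper's actual interfaces.

```python
from typing import Callable

def chain_of_retrieval(
    question: str,
    llm: Callable[[str], str],        # prompt -> completion (hypothetical interface)
    retrieve: Callable[[str], list],  # sub-query -> retrieved passages
    max_steps: int = 4,
) -> str:
    """Iteratively reformulate sub-queries from the evolving reasoning state,
    retrieve for each one, then answer from the accumulated chain."""
    chain: list[tuple[str, str]] = []
    for _ in range(max_steps):
        # Query reformulation conditioned on the chain built so far.
        sub_query = llm(
            f"Question: {question}\nChain so far: {chain}\n"
            "Next sub-query, or DONE if enough is known:"
        ).strip()
        if sub_query == "DONE":
            break
        passages = retrieve(sub_query)  # one retrieval step per sub-query
        sub_answer = llm(
            f"Sub-query: {sub_query}\nPassages: {passages}\nSub-answer:"
        )
        chain.append((sub_query, sub_answer))
    # Final answer conditioned on the full retrieval chain.
    return llm(f"Question: {question}\nChain: {chain}\nFinal answer:")
```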

The CoRAG framework enhances RAG models through three key components: retrieval chain generation, model training, and test-time scaling strategies. Retrieval chains are generated using rejection sampling, where intermediate sub-queries and sub-answers are iteratively formed, and the chain with the highest log-likelihood score is selected to augment datasets. Using a multi-task learning framework, the model is trained on these augmented datasets for sub-query, sub-answer, and final answer prediction. At test time, decoding strategies like greedy decoding, best-of-N sampling, and tree search allow for controlling token consumption and retrieval steps. These approaches optimize the trade-off between performance and compute efficiency.
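A compact sketch of the rejection-sampling step, under the assumption that chain sampling and answer scoring are supplied as callables (neither signature comes from the paper): sample several candidate chains, score each by the log-likelihood the model assigns to the gold answer given that chain, and keep only the best one to augment the training data.

```python
import math
from typing import Callable

def best_retrieval_chain(
    question: str,
    gold_answer: str,
    sample_chain: Callable[[str], list],  # stochastic chain sampler (assumed)
    answer_log_likelihood: Callable[[str, list, str], float],  # log P(answer | question, chain)
    n_samples: int = 8,
) -> list:
    """Rejection sampling over candidate retrieval chains: keep the chain
    under which the gold answer is most likely, discard the rest."""
    best_chain, best_score = None, -math.inf
    for _ in range(n_samples):
        chain = sample_chain(question)
        score = answer_log_likelihood(question, chain, gold_answer)
        if score > best_score:
            best_chain, best_score = chain, score
    return best_chain  # becomes intermediate supervision in the augmented dataset
```

At test time no gold answer exists, so a best-of-N strategy would need some model-derived score in place of `answer_log_likelihood`; greedy decoding and tree search instead trade sampled breadth for determinism or structured exploration of the chain space.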

The evaluation of CoRAG was conducted using two benchmarks: (1) multi-hop QA datasets, including 2WikiMultihopQA, HotpotQA, Bamboogle, and MuSiQue, to test multi-hop reasoning, and (2) the KILT benchmark for generalization across knowledge-intensive tasks. Fine-tuning was performed on Llama-3.1-8B-Instruct using retrieval chain-augmented datasets. CoRAG-8B significantly outperformed baselines in most multi-hop QA datasets, except Bamboogle, where limited instances and outdated retrieval data caused variability. In the KILT benchmark, CoRAG achieved state-of-the-art performance across tasks, except for FEVER, where a larger model slightly surpassed it. Performance scaling experiments showed improvements with increased retrieval chain lengths and sampling strategies.

In conclusion, the study presents CoRAG, a framework that trains LLMs to retrieve and reason through complex queries iteratively. Unlike traditional RAG methods that rely on a single retrieval step, CoRAG dynamically reformulates queries during retrieval, enhancing accuracy. Intermediate retrieval chains are automatically generated using rejection sampling, eliminating the need for manual annotations. At test time, adaptive decoding strategies balance performance with computational efficiency. CoRAG achieves state-of-the-art results on multi-hop QA datasets and the KILT benchmark, outperforming larger models. Detailed analysis highlights its scaling and generalization capabilities, paving the way for advancing factual, grounded, and trustworthy AI systems in challenging tasks.


Check out the Paper. All credit for this research goes to the researchers of this project.

