MarkTechPost@AI, December 8, 2024
Retrieval-Augmented Reasoning Enhancement (RARE): A Novel Approach to Factual Reasoning in Medical and Commonsense Domains

 

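RARE is a new retrieval-augmented reasoning framework designed to improve the reasoning accuracy and factual integrity of large language models in medical and commonsense domains. It introduces two mechanisms, query generation and sub-question refinement, within a Monte Carlo Tree Search (MCTS) framework and pairs them with a retrieval-augmented factuality scorer, which strengthens the model's reasoning. Without additional training or fine-tuning, RARE outperforms existing methods on multiple benchmarks, showing strong performance and adaptability.

🧠 The RARE framework uses a two-stage architecture: the candidate generation stage uses an MCTS-based self-generator that dynamically fetches contextually relevant external information; the factuality evaluation stage uses the Retrieval-Augmented Factuality Scorer (RAFC) to evaluate candidate reasoning trajectories and selects the one with the highest factuality score as the final answer.

🩺 RARE performs well on medical and commonsense reasoning tasks. On the medical reasoning benchmarks MedQA, MedMCQA, and MMLU-Medical, RARE improves the LLaMA3.2 3B model by 2.59%, 2.35%, and 1.66%, respectively.

💡 In commonsense reasoning evaluations, RARE's gains for the LLaMA3.1 8B model are likewise notable: 6.45%, 4.26%, 2.1%, and 1.85% on StrategyQA, CommonsenseQA, Social IQA, and Physical IQA, respectively.

🚀 RARE's key strength is that it achieves robust, adaptable performance across tasks without additional model training or fine-tuning, pointing to new directions for future research in complex reasoning.

🔍 RARE's limitations are that it has been tested only on open-source models such as LLaMA 3.1 and has not yet been validated on large proprietary models such as GPT-4; in addition, it currently uses only MCTS to explore action paths and does not use a trained reward model to dynamically guide the search.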

Question answering (QA) has emerged as a critical task in natural language processing, aiming to generate precise answers to complex queries across diverse domains. Within this field, medical QA poses unique challenges because of the complexity of healthcare information. Medical scenarios demand reasoning that goes beyond simple information retrieval, since models must interpret each scenario and produce context-aware responses. The task involves synthesizing patient information, analyzing medical conditions, and proposing evidence-based interventions through structured, multi-step reasoning. Traditional QA systems struggle to meet these specialized demands, which involve intricate decision-making processes.

Existing research has explored various methods for enhancing LLMs' reasoning capabilities across multiple domains. Prompting techniques such as Chain-of-Thought have become prominent approaches for improving inference through carefully designed reasoning sequences. Another method, Monte Carlo Tree Search (MCTS), has shown potential for optimizing solution paths by improving exploration efficiency and decision-making quality in domains such as game theory and strategic planning. Retrieval-augmented generation (RAG) techniques have shown promise in medical contexts, enabling LLMs to ground their reasoning in up-to-date documents. However, developing comprehensive reasoning frameworks that handle complex, multi-step medical scenarios remains a significant challenge.
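To make the exploration trade-off in MCTS concrete, here is a minimal sketch of the widely used UCT selection rule. The article does not say which selection rule RARE uses, so the rule itself, the function names, the exploration constant `c`, and the toy node statistics are all illustrative assumptions.

```python
import math

def uct_score(total_reward, visits, parent_visits, c=1.41):
    """UCT value of a child: average reward plus an exploration bonus.

    Assumption: standard UCT; not confirmed as the rule used by RARE.
    """
    if visits == 0:
        return math.inf  # unvisited children are always tried first
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children, parent_visits):
    """Pick the (name, total_reward, visits) tuple with the highest UCT score."""
    return max(children, key=lambda ch: uct_score(ch[1], ch[2], parent_visits))

# Hypothetical statistics for three candidate reasoning steps.
children = [("step_a", 3.0, 5), ("step_b", 1.0, 1), ("step_c", 0.0, 0)]
best = select_child(children, parent_visits=6)
print(best[0])
```

The unvisited step is chosen first; among visited steps, a rarely tried step with a decent reward can outrank a frequently tried one, which is how the search avoids committing too early to a single reasoning path.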

Researchers from the University of Massachusetts Amherst, University of Massachusetts Medical School, Worcester, University of Massachusetts Lowell, and VA Bedford Health Care have proposed RARE (Retrieval-Augmented Reasoning Enhancement) to improve the reasoning accuracy and factual integrity of LLMs on complex, knowledge-intensive tasks such as medical and commonsense reasoning. The approach adds two actions to the MCTS framework: a query generation mechanism for information retrieval and a sub-question refinement strategy. By using retrieved contextual information and a Retrieval-Augmented Factuality Scorer (RAFC), RARE improves reasoning accuracy while maintaining high standards of factual integrity. It represents a significant advance in computational reasoning, offering a scalable approach that enables open-source LLMs to compete with top-tier closed-source models.
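The two retrieval actions can be pictured as ways of expanding a node in the search tree. The sketch below is a toy illustration only: the `Node` class, both action functions, the string templates standing in for LLM calls, and the `toy_retrieve` stub are hypothetical and are not taken from the RARE code.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Node:
    question: str
    context: List[str] = field(default_factory=list)  # retrieved evidence so far
    action: str = "root"

Retriever = Callable[[str], List[str]]

def generate_query_action(node: Node, retrieve: Retriever) -> Node:
    """Action 1 (assumed shape): write a search query, attach what it retrieves."""
    query = f"background facts for: {node.question}"  # an LLM would write this
    return Node(node.question, node.context + retrieve(query), "generate_query")

def refine_subquestion_action(node: Node, retrieve: Retriever) -> Node:
    """Action 2 (assumed shape): retrieve for a sub-question, then restate it."""
    sub_q = f"sub-question of: {node.question}"  # an LLM would write this
    docs = retrieve(sub_q)
    refined = f"{sub_q} (given {len(docs)} retrieved passage(s))"
    return Node(refined, node.context + docs, "refine_subquestion")

def expand(node: Node, retrieve: Retriever) -> List[Node]:
    """Apply both retrieval-augmented actions to produce child nodes."""
    return [generate_query_action(node, retrieve),
            refine_subquestion_action(node, retrieve)]

def toy_retrieve(query: str) -> List[str]:
    """Stand-in for a real retriever; returns one fake passage."""
    return [f"passage about '{query}'"]

root = Node("placeholder multi-step question")
children = expand(root, toy_retrieve)
print([child.action for child in children])
```

Each child carries its own copy of the accumulated evidence, so different branches of the tree can reason from different retrieved passages without interfering with one another.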

The RARE framework uses a two-stage architecture to improve reasoning accuracy through retrieval-augmented mechanisms. The first stage, Candidate Generation, uses a retrieval-augmented generator that builds on the MCTS-based self-generator approach. This generator dynamically applies two retrieval-augmented actions that fetch contextually relevant external information, improving the relevance and precision of candidate reasoning trajectories. The second stage, Factuality Evaluation, replaces traditional discriminator models with the RAFC. The scorer evaluates each candidate trajectory, and the trajectory with the highest factuality score is selected as the final answer. This selection favors reasoning paths with robust factual support and improves overall response quality.
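The second stage amounts to scoring each candidate trajectory and keeping the best one. The following is a minimal sketch under stated assumptions: the article does not describe how the factuality score is computed, so scoring a trajectory as the fraction of its steps supported by evidence is an assumption, and `factuality_score`, `select_answer`, and the toy evidence set are hypothetical names and data.

```python
def factuality_score(trajectory, is_supported):
    """Fraction of reasoning steps that the evidence supports (assumed metric)."""
    steps = trajectory["steps"]
    if not steps:
        return 0.0
    supported = sum(1 for step in steps if is_supported(step))
    return supported / len(steps)

def select_answer(candidates, is_supported):
    """Return the answer of the candidate with the highest factuality score."""
    best = max(candidates, key=lambda t: factuality_score(t, is_supported))
    return best["answer"]

# Toy stand-in for retrieval: a fixed set of statements treated as supported.
evidence = {"s2", "s3", "s4"}
candidates = [
    {"answer": "A", "steps": ["s1", "s2"]},  # one of two steps supported
    {"answer": "B", "steps": ["s3", "s4"]},  # both steps supported
]
final = select_answer(candidates, lambda step: step in evidence)
print(final)
```

In a real system the `is_supported` check would itself call a retriever and a model to judge each statement; here it is a set lookup so the selection logic can be read in isolation.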

RARE outperforms existing baseline methods across medical and commonsense reasoning tasks. On medical reasoning benchmarks, the framework consistently improves performance across different LLaMA model sizes. For the LLaMA3.2 3B model, RARE delivers a 2.59% improvement on MedQA, a 2.35% improvement on MedMCQA, and a 1.66% improvement on MMLU-Medical compared to the rStar baseline. Commonsense reasoning evaluations further support RARE's effectiveness: on the LLaMA3.1 8B model, it achieves a 6.45% improvement on StrategyQA, a 4.26% improvement on CommonsenseQA, a 2.1% improvement on Social IQA, and a 1.85% improvement on Physical IQA.

In conclusion, the researchers introduced RARE, which represents a significant advance in enhancing LLMs' reasoning capabilities through retrieval-augmented techniques. The method shows strong potential for addressing complex reasoning challenges in medical and commonsense domains by introducing autonomous reasoning actions and a factuality scoring mechanism. Its key strength is that it operates without additional model training or fine-tuning, giving robust and adaptable performance across diverse tasks. Future research could extend RARE's approach to additional complex reasoning domains and refine retrieval-augmented reasoning techniques.

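RARE also has some limitations: it has been tested only on open-source models such as LLaMA 3.1 and has not yet been validated on large proprietary models such as GPT-4, and it currently relies on MCTS alone to explore action paths, without a trained reward model to dynamically guide the search.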


Check out the Paper. All credit for this research goes to the researchers of this project.

The post Retrieval-Augmented Reasoning Enhancement (RARE): A Novel Approach to Factual Reasoning in Medical and Commonsense Domains appeared first on MarkTechPost.
