PaperAgent, September 5, 2024
EfficientRAG from Microsoft and Others: Iterative Query Decomposition Improves Multi-Hop Question Answering

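EfficientRAG consists of two lightweight components and answers complex queries through multi-step reasoning. It performs well on multi-hop question answering datasets, which underlines the importance of effective information retrieval and filtering.

🎯 EfficientRAG is built from two lightweight components, the Labeler & Tagger and the Filter. Both are token-level classifiers that identify and filter information, and both run inside an iterative RAG system.

💪 EfficientRAG retrieves relevant chunks from the knowledge base and tags them. The Filter processes the concatenation of the original question and the annotated tokens to produce a new query, and the process repeats. This improves efficiency and gives good results on several datasets.

📈 EfficientRAG reaches comparable recall while retrieving the fewest chunks, and its end-to-end QA performance is high, showing its advantage in retrieval efficiency and in handling complex queries.

🔍 The experiments show that retrieval augmentation helps model accuracy, that irrelevant chunks are a challenge for the LLM generator, and that the EfficientRAG Decompose strategy is superior in retrieval efficiency.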

2024-09-04 21:58, Hubei

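Multi-hop question answering covers complex queries whose answers require multi-step reasoning, which usually goes beyond what a single retrieval round can provide.

To handle such questions, Nanjing University and Microsoft propose EfficientRAG, which consists of two lightweight components: the Labeler & Tagger and the Filter. Both act as token-level classifiers that identify and filter information.

The EfficientRAG framework runs inside an iterative RAG system. Initially, EfficientRAG retrieves relevant chunks from the knowledge base, tags each chunk as <Continue> or <Terminate>, and annotates the tokens to preserve in the chunk, here "KGOT in the Dimond Center". The Filter then processes the concatenation of the original question and the previously annotated tokens, "Q: How large is the shopping mall where KGOT radio station has its studios? Info: KGOT, in the Dimond Center", and annotates the tokens of the next-hop query, "How large is Dimond Center?". This iterative process continues until all chunks are tagged <Terminate> or the maximum number of iterations is reached.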
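The iterative loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: `retrieve`, `label_and_tag`, and `filter_query` are hypothetical stand-ins for the retriever, the Labeler & Tagger, and the Filter, the two chunk tags are written as `<Continue>` and `<Terminate>`, and the toy components are hard-coded rules over a two-chunk in-memory knowledge base with placeholder text.

```python
from typing import Callable, List, Tuple

CONTINUE, TERMINATE = "<Continue>", "<Terminate>"


def efficient_rag_loop(
    question: str,
    retrieve: Callable[[str], List[str]],
    label_and_tag: Callable[[str, str], Tuple[str, str]],
    filter_query: Callable[[str], str],
    max_iterations: int = 3,
) -> List[str]:
    """Collect chunks over several retrieval hops (simplified sketch)."""
    collected: List[str] = []
    queries = [question]
    for _ in range(max_iterations):
        next_queries: List[str] = []
        for query in queries:
            for chunk in retrieve(query):
                if chunk in collected:
                    continue  # skip chunks already seen in an earlier hop
                tag, kept_tokens = label_and_tag(query, chunk)
                # Simplification: every newly retrieved chunk is kept.
                collected.append(chunk)
                if tag == CONTINUE:
                    # The Filter sees the query concatenated with the kept tokens.
                    concatenated = f"Q: {query} Info: {kept_tokens}"
                    next_queries.append(filter_query(concatenated))
        if not next_queries:
            break  # every chunk in this hop was tagged <Terminate>
        queries = next_queries
    return collected


# --- Toy stand-ins (hypothetical, rule-based, placeholder text) ---
KNOWLEDGE_BASE = [
    "KGOT is a radio station with studios in the Dimond Center.",
    "Toy chunk: the Dimond Center is a shopping mall (its size would be stated here).",
]
QUESTION = "How large is the shopping mall where KGOT radio station has its studios?"


def toy_retrieve(query: str) -> List[str]:
    if "KGOT" in query:
        return [KNOWLEDGE_BASE[0]]
    if "Dimond Center" in query:
        return [KNOWLEDGE_BASE[1]]
    return []


def toy_label_and_tag(query: str, chunk: str) -> Tuple[str, str]:
    if "KGOT" in chunk:
        return CONTINUE, "KGOT, in the Dimond Center"
    return TERMINATE, "Dimond Center is a shopping mall"


def toy_filter(concatenated: str) -> str:
    # A trained Filter would select tokens; here the rewrite is hard-coded.
    if "Dimond Center" in concatenated:
        return "How large is Dimond Center?"
    return concatenated


chunks = efficient_rag_loop(QUESTION, toy_retrieve, toy_label_and_tag, toy_filter)
print(chunks)
```

In this sketch the loop stops early once a hop produces no `<Continue>` chunk, which mirrors the stopping condition in the text; `max_iterations` bounds the number of hops otherwise.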

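Experimental results show that EfficientRAG outperforms existing RAG methods on three open-domain multi-hop question answering datasets.

Retrieval performance. The baselines are implemented from their source code. Bold and underlined fonts mark the best and second-best results, respectively. EfficientRAG shows comparable recall while retrieving the fewest chunks.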
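For reference, recall over retrieved chunks can be computed as the share of gold supporting chunks that appear among the retrieved ones. The sketch below assumes chunks are identified by hypothetical string IDs; the exact metric definition used in the paper may differ.

```python
from typing import Iterable


def chunk_recall(retrieved: Iterable[str], gold: Iterable[str]) -> float:
    """Fraction of gold chunk IDs that were retrieved."""
    gold_set = set(gold)
    if not gold_set:
        return 0.0  # convention chosen for this sketch when there is no gold chunk
    return len(gold_set & set(retrieved)) / len(gold_set)


# Three chunks retrieved, one of the two gold chunks found.
example_recall = chunk_recall(["c1", "c7", "c9"], ["c1", "c2"])
print(example_recall)
```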

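End-to-end question answering performance on the three datasets. The highest accuracy (Acc) is shown in bold and the second highest is underlined. EfficientRAG shows promisingly high accuracy, comparable to the baselines based on large language models (LLMs).

While improving retrieval efficiency, EfficientRAG also achieves better performance on complex queries.

The method shows that efficient multi-hop question answering is possible through effective information retrieval and filtering, even without relying on large language models.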

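The multi-hop question answering experiments with EfficientRAG also support the following conclusions:

Performance on the 2WikiMQA dataset under different chunk settings, with GPT-3.5, GPT-4, and Llama3-8B as generators.

Retrieval efficiency (recall) of three retrieval strategies on the MuSiQue dataset. The x-axis uses a logarithmic scale. Corresponding points on the different lines represent the same number of retrieved chunks.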

Appendix

Detailed prompt for chain-of-thought (CoT) question answering with Llama-3 8B, applied to HotpotQA.

CoT Prompting for HotpotQA

As an assistant, your task is to answer the question after <Question>. You should first think step by step about the question and give your thought and then answer the <Question>. Your answer should be after <Answer> in JSON format with key "thought" and "answer" and their values should be string.
There are some examples for you to refer to:
<Question>: What is the name of this American musician, singer, actor, comedian, and songwriter, who worked with Modern Records and born in December 5, 1932?
<Answer>:
ˋˋˋ json
{{"thought":"Modern Record is a big R&B label with artists including Etta James, Joe Houston, Little Richard, Ike, Tina Turner and John Lee Hooker in the 1950s and 1960s. Little Richard is an American musician, signer actor and songwriter, born in December 5 1932. So the answer is Little Richard.","answer": "Little Richard"}}
ˋˋˋ
<Question>: Between Chinua Achebe and Rachel Carson, who had more diverse jobs?
<Answer>:
ˋˋˋ json
{{"thought":"Chinua Achebe was a Nigerian novelist, poet, professor, and critic. Rachel Carson was an American marine biologist, author, and conservationist. Chinua Achebe has 4 jobs while Rachel Carson has 3 jobs. So the answer is Chinua Achebe.","answer": "Chinua Achebe"}}
ˋˋˋ
<Question>: Remember Me Ballin’ is a CD single by Indo G that features an American rapper born in what year?
<Answer>:
ˋˋˋ json
{{"thought":"Remember Me Ballin’ is the CD singer by Indo G that features Gangsta Boo, who is named Lola Mitchell, an American rapper born in 1979. So the answer is 1979.","answer": "1979"}}
ˋˋˋ
Now your Question is
<Question>: {question}
<Answer>:
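The doubled braces `{{ }}` around the JSON examples and the single-brace `{question}` placeholder suggest that the template is filled in with Python's `str.format`. That is an assumption based on the template syntax, not something the article states. The sketch below uses a shortened, hypothetical template (not the full prompt) to show how the escaping behaves; `build_cot_prompt` is an illustrative name.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Shortened stand-in for the CoT template; the example content is a placeholder.
COT_TEMPLATE = (
    "As an assistant, your task is to answer the question after <Question>.\n"
    "<Question>: Example question?\n"
    "<Answer>:\n"
    + FENCE + " json\n"
    '{{"thought": "Example reasoning.", "answer": "Example answer"}}\n'
    + FENCE + "\n"
    "Now your Question is\n"
    "<Question>: {question}\n"
    "<Answer>:\n"
)


def build_cot_prompt(question: str) -> str:
    """Fill the template; '{{' and '}}' become literal braces in the output."""
    return COT_TEMPLATE.format(question=question)


prompt = build_cot_prompt(
    "How large is the shopping mall where KGOT radio station has its studios?"
)
print(prompt)
```

Without the doubled braces, `str.format` would try to interpret the JSON keys in the few-shot examples as replacement fields and fail, which is presumably why the template escapes them.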

Detailed prompt for multi-hop question decomposition, used for all datasets.

Question Decomposition Prompt

You are assigned a multi-hop question decomposition task.
You should decompose the given multi-hop question into multiple single-hop questions, and such that you can answer each single-hop question independently.
Your response must be wrapped with ˋˋˋjson and ˋˋˋ.
You should answer in JSON format, your answer must contain the following keys:
- "decomposed_questions": a list of strings, each string is a single-hop question.
Here are some examples for your reference:
## Examples
<Multi-hop question>: Which film came out first, The Love Route or Engal Aasan?
Your response:
ˋˋˋjson
{{ "decomposed_questions": [ "When does the film The Love Route come out?", "When does the film Engal Aasan come out?" ] }}
ˋˋˋ
<Multi-hop question>: Where did the spouse of Moderen’s composer die?
Your response:
ˋˋˋjson
{{ "decomposed_questions": [ "Who is Modern’s composer?", "Who is the spouse of Carl Nielsen?", "In what place did Anne Marie Carl-Nielsen die?" ] }}
ˋˋˋ
<Multi-hop question>: Where was the director of film The Fascist born?
Your response:
ˋˋˋjson
{{ "decomposed_questions": [ "Who is the director of film The Fascist?", "Where was Luciano Salce born?" ] }}
ˋˋˋ
## Now it’s your turn:
<Multi-hop question>: {question}
Your response:
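Because the prompt asks for a reply wrapped in a JSON code fence with a `decomposed_questions` key, the reply can be parsed with the standard library. The following is a minimal sketch: `parse_decomposition` is a hypothetical helper, and falling back to the raw text when no fence is found is a choice made for this example only.

```python
import json
import re
from typing import List

FENCE = "`" * 3  # three backticks
# Matches a json-tagged fenced block and captures its body.
_FENCED_JSON = re.compile(
    re.escape(FENCE) + r"\s*json\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def parse_decomposition(response: str) -> List[str]:
    """Extract the list of single-hop questions from a model reply."""
    match = _FENCED_JSON.search(response)
    payload = match.group(1) if match else response  # fall back to raw text
    data = json.loads(payload)
    questions = data["decomposed_questions"]
    if not isinstance(questions, list) or not all(
        isinstance(q, str) for q in questions
    ):
        raise ValueError("decomposed_questions must be a list of strings")
    return questions


# A reply shaped like the third few-shot example in the prompt.
reply = (
    FENCE + "json\n"
    '{"decomposed_questions": ["Who is the director of film The Fascist?", '
    '"Where was Luciano Salce born?"]}\n'
    + FENCE
)
sub_questions = parse_decomposition(reply)
print(sub_questions)
```

A malformed reply raises `json.JSONDecodeError`, `KeyError`, or `ValueError` here; a real pipeline would likely retry or skip such replies.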
EfficientRAG: Efficient Retriever for Multi-Hop Question Answering
https://arxiv.org/pdf/2408.04259
