Juejin · Artificial Intelligence · July 28, 11:23
ReAct: Reducing LLM Hallucinations, Improving Accuracy

Large language models (LLMs) are good at reasoning and good at acting, but struggle to combine the two. The ReAct framework interleaves the generation of reasoning traces with task-specific actions, letting an LLM call external tools for information, which markedly reduces hallucinations and makes responses more accurate and reliable. Through a "Thought, Action, PAUSE, Observation" loop, the LLM can query Wikipedia, run calculations, or search a specific blog, and fold that external feedback into its next decision. This improves output quality and also makes the model's process more interpretable and easier to trust. To avoid infinite loops, however, it is advisable to set a maximum number of iterations in a ReAct loop.

💡 **Core mechanism of ReAct**: ReAct combines the LLM's reasoning with external tool execution through a loop of Thought, Action, PAUSE, and Observation. The LLM first thinks, then decides on an action (such as a calculation or a search); once the action runs, it receives an observation, reasons about that result in the next step, and repeats until it outputs a final answer.
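The loop described above can be sketched as a small driver function. In this minimal sketch the scripted replies stand in for a real chat-model call and are purely hypothetical; only the Thought/Action/PAUSE/Observation protocol itself is taken from the article:

```python
import re

# Hypothetical two-turn script standing in for the LLM: first an Action,
# then (after seeing the Observation) a final Answer.
scripted_replies = iter([
    "Thought: I need to multiply the numbers\nAction: calculate: 15 * 25\nPAUSE",
    "Answer: 375",
])

def fake_llm(prompt):
    # Stand-in for a chat-model call; returns the next scripted reply.
    return next(scripted_replies)

action_re = re.compile(r"^Action: (\w+): (.*)$")

def react_loop(question, tools, max_turns=5):
    next_prompt = question
    for _ in range(max_turns):
        reply = fake_llm(next_prompt)
        match = next((action_re.match(line) for line in reply.split("\n")
                      if action_re.match(line)), None)
        if match is None:
            return reply  # no Action line: the model produced its Answer
        name, arg = match.groups()
        observation = tools[name](arg)               # run the requested tool
        next_prompt = f"Observation: {observation}"  # feed the result back
    return None  # turn limit reached without an Answer

result = react_loop("Fifteen * twenty five", {"calculate": lambda s: eval(s)})
```

The key point is that the driver, not the model, executes the tool and re-prompts with the observation; the model only ever sees text.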

🛠️ **Calling external tools**: ReAct gives the LLM the ability to invoke external tools (Wikipedia, a calculator, a custom search, etc.) and obtain real outside information. For example, `wikipedia: France` fetches information about France, and `calculate: 4 * 7 / 3` runs a math calculation. This greatly broadens what the LLM can do and reduces hallucinations.
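Action strings like those above follow a simple `name: argument` format, so they can be dispatched with a plain dict. A minimal sketch; the `wikipedia` stub here is hypothetical, kept offline so the example is self-contained:

```python
import re

# Matches "calculate: 4 * 7 / 3" into ("calculate", "4 * 7 / 3").
action_re = re.compile(r"^(\w+): (.*)$")

tools = {
    "calculate": lambda expr: eval(expr),              # e.g. "4 * 7 / 3"
    "wikipedia": lambda term: f"[summary of {term}]",  # hypothetical offline stub
}

def run_action(action_string):
    name, arg = action_re.match(action_string).groups()
    return tools[name](arg)

print(run_action("calculate: 4 * 7 / 3"))  # prints 9.333...
print(run_action("wikipedia: France"))
```

Note that `eval` on model-produced text is convenient for a demo but unsafe in production; a real deployment would use a restricted expression evaluator.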

📈 **Better accuracy and interpretability**: By interacting with external tools, ReAct brings real-world data into the loop, significantly improving the accuracy and reliability of the LLM's responses. Printing the Thought and Action steps also makes the workflow transparent, helping users understand and trust the model's behavior.

🔄 **Avoiding loops with an iteration cap**: Powerful as it is, ReAct risks getting stuck in an infinite loop. To guard against this, set a maximum iteration step (MAX_Iteration_Step) in the framework so the model cannot iterate excessively or unproductively, keeping the process efficient and under control.

Summary

LLMs are advancing at a breakneck pace, but one key problem remains: an LLM is good at reasoning and good at acting, yet it cannot combine the two.

The ReAct framework generates reasoning traces and task-specific actions in an interleaved way: the model plans its next action from the reasoning trace it has just produced. ReAct thus lets the LLM call external tools to fetch outside information, which curbs hallucination and produces more reliable, more accurate responses. Because ReAct also surfaces the reasoning process and the actions taken, it improves the interpretability of, and trust in, the LLM.

What is ReAct

Below, reasoning and acting are shown as separate processes:

ReAct combines the two, so that when the model acts it has enough real information on hand, improving accuracy to a degree:

How ReAct Works

The ReAct framework lets the LLM take additional actions, such as searching Google or running a calculation, to obtain extra information. It tells the program how to execute those actions and feeds the results back to the LLM, which then decides the next move. Here is a concrete example:

The full code is as follows:

```python
import openai
import re
import httpx

openai.api_key = "sk-..."


class ChatBot:
    def __init__(self, system=""):
        self.system = system
        self.messages = []
        if self.system:
            self.messages.append({"role": "system", "content": system})

    def __call__(self, message):
        self.messages.append({"role": "user", "content": message})
        result = self.execute()
        self.messages.append({"role": "assistant", "content": result})
        return result

    def execute(self):
        completion = openai.ChatCompletion.create(
            model="gpt-3.5-turbo", messages=self.messages
        )
        # Uncomment this to print out token usage each time, e.g.
        # {"completion_tokens": 86, "prompt_tokens": 26, "total_tokens": 112}
        # print(completion.usage)
        return completion.choices[0].message.content


prompt = """
You run in a loop of Thought, Action, PAUSE, Observation.
At the end of the loop you output an Answer
Use Thought to describe your thoughts about the question you have been asked.
Use Action to run one of the actions available to you - then return PAUSE.
Observation will be the result of running those actions.

Your available actions are:

calculate:
e.g. calculate: 4 * 7 / 3
Runs a calculation and returns the number - uses Python so be sure to use floating point syntax if necessary

wikipedia:
e.g. wikipedia: Django
Returns a summary from searching Wikipedia

simon_blog_search:
e.g. simon_blog_search: Django
Search Simon's blog for that term

Always look things up on Wikipedia if you have the opportunity to do so.

Example session:

Question: What is the capital of France?
Thought: I should look up France on Wikipedia
Action: wikipedia: France
PAUSE

You will be called again with this:

Observation: France is a country. The capital is Paris.

You then output:

Answer: The capital of France is Paris
""".strip()

action_re = re.compile(r'^Action: (\w+): (.*)$')


def query(question, max_turns=5):
    i = 0
    bot = ChatBot(prompt)
    next_prompt = question
    while i < max_turns:
        i += 1
        result = bot(next_prompt)
        print(result)
        actions = [action_re.match(a) for a in result.split('\n') if action_re.match(a)]
        if actions:
            # There is an action to run
            action, action_input = actions[0].groups()
            if action not in known_actions:
                raise Exception("Unknown action: {}: {}".format(action, action_input))
            print(" -- running {} {}".format(action, action_input))
            observation = known_actions[action](action_input)
            print("Observation:", observation)
            next_prompt = "Observation: {}".format(observation)
        else:
            return


def wikipedia(q):
    return httpx.get("https://en.wikipedia.org/w/api.php", params={
        "action": "query",
        "list": "search",
        "srsearch": q,
        "format": "json"
    }).json()["query"]["search"][0]["snippet"]


def simon_blog_search(q):
    results = httpx.get("https://datasette.simonwillison.net/simonwillisonblog.json", params={
        "sql": """
        select
          blog_entry.title || ': ' || substr(html_strip_tags(blog_entry.body), 0, 1000) as text,
          blog_entry.created
        from
          blog_entry join blog_entry_fts on blog_entry.rowid = blog_entry_fts.rowid
        where
          blog_entry_fts match escape_fts(:q)
        order by
          blog_entry_fts.rank
        limit
          1""".strip(),
        "_shape": "array",
        "q": q,
    }).json()
    return results[0]["text"]


def calculate(what):
    return eval(what)


known_actions = {
    "wikipedia": wikipedia,
    "calculate": calculate,
    "simon_blog_search": simon_blog_search
}
```

Running it:

```
query("Fifteen * twenty five")
Thought: The action required is a calculation
Action: calculate: 15 * 25
PAUSE
 -- running calculate 15 * 25
Observation: 375
Answer: Fifteen times twenty five equals 375.

query("What does England share borders with?")
Thought: I should list down the neighboring countries of England
Action: wikipedia: England
PAUSE
 -- running wikipedia England
Observation: <span class="searchmatch">England</span> is a country that is part of the United Kingdom. It shares land borders with Wales to its west and Scotland to its north. The Irish Sea lies northwest
Answer: England shares borders with Wales and Scotland.
```

Ending

ReAct is a process of repeatedly calling the LLM to reason, act, and observe. Compared with plain function calling, ReAct can gather enough real external information over multiple iterations, reducing hallucinations and improving the accuracy of responses. However, ReAct can also fall into a loop, so it is best to set a MAX_Iteration_Step in the loop to prevent runaway iteration.
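The iteration cap can be exercised directly: with a stand-in model that keeps emitting actions and never reaches an Answer, the loop must stop on its own after `max_turns` steps instead of spinning forever. A minimal sketch; the looping model below is hypothetical:

```python
def looping_llm(prompt):
    # Hypothetical worst case: the model never converges to an Answer.
    return "Thought: still unsure\nAction: noop: retry\nPAUSE"

def query(question, llm, max_turns=5):
    turns = 0
    next_prompt = question
    while turns < max_turns:
        turns += 1
        reply = llm(next_prompt)
        if reply.startswith("Answer:"):
            return reply, turns
        next_prompt = "Observation: nothing new"  # stubbed tool result
    return None, turns  # cap reached: bail out rather than loop forever

answer, turns_used = query("unanswerable question", looping_llm, max_turns=5)
```

Returning an explicit sentinel (here `None`) when the cap is hit also lets the caller distinguish "gave up" from a genuine answer.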

Reference: react-lm.github.io/

