Juejin · Artificial Intelligence
Building a RAG App with LangChain.js (Part 2): Chains and Agents

This is the second part of the series on building RAG applications. It focuses on extending a RAG app to support conversational interaction and multi-step retrieval, exploring two approaches: chains and agents. By adding logic that incorporates historical messages, the app can respond to multi-turn user questions. The article shows how to manage conversation history with LangGraph and provides code examples for building a question-answering app that supports multi-turn dialogue. It also covers LangGraph's persistence layer, which makes it a natural fit for chat applications with multiple conversational turns.

💬 The core of the article is adding logic that incorporates historical messages so the app can understand and respond across multiple turns. This requires managing the conversation history, which is the key to conversational interaction.

⚙️ Two implementations are presented: chains and agents. A chain performs at most a single retrieval step, while an agent lets the model decide for itself to run multiple retrievals, accommodating more complex conversational needs.

🛠️ The article demonstrates how to build the app with LangGraph. By defining nodes and edges, the steps of taking user input, retrieving context, and generating a response are wired into a complete question-answering flow.

💾 The article highlights LangGraph's persistence support, which lets the app maintain a consistent chat history across turns. With checkpoints, conversation state is easy to manage and maintain.

Original article: Build a Retrieval Augmented Generation (RAG) App: Part 2

In many question-answering applications we want to let users have a back-and-forth conversation, which means the app needs some form of "memory" of past questions and answers, plus logic for incorporating that information into its current thinking.

This is the second part of the tutorial series.

Here we focus on adding logic for incorporating historical messages, which involves managing the chat history.

We will cover two approaches:

    Chains, which execute at most a single retrieval step;
    Agents, which let the LLM decide for itself to execute multiple retrieval steps.

As the external knowledge source we will continue to use Lilian Weng's blog post "LLM Powered Autonomous Agents" from Part 1 of the RAG tutorial.

Setup

Components

Choose an LLM, an embedding model, a vector store, and LangSmith.

(See the previous part for details; omitted here.)

Chains

Let's first revisit the vector store we built in Part 1, which indexes Lilian Weng's blog post "LLM Powered Autonomous Agents".

```typescript
import "cheerio";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { CheerioWebBaseLoader } from "@langchain/community/document_loaders/web/cheerio";

// Load and chunk contents of the blog
const pTagSelector = "p";
const cheerioLoader = new CheerioWebBaseLoader(
  "https://lilianweng.github.io/posts/2023-06-23-agent/",
  { selector: pTagSelector }
);
const docs = await cheerioLoader.load();

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const allSplits = await splitter.splitDocuments(docs);

// Index chunks
await vectorStore.addDocuments(allSplits);
```

In Part 1 of the RAG tutorial, we represented the user input, retrieved context, and generated answer as separate keys in the state. Conversational experiences are naturally represented as a sequence of messages. In addition to messages from the user and assistant, retrieved documents and other information can be incorporated into the sequence via tool messages. This motivates representing the state of our RAG application as a sequence of messages. Specifically, we will have:

    User input as a HumanMessage;
    Vector store queries as AIMessages with tool calls;
    Retrieved documents as ToolMessages;
    The final response as an AIMessage.

This state model is general enough that LangGraph offers a built-in implementation for it:

```typescript
import { MessagesAnnotation, StateGraph } from "@langchain/langgraph";

const graph = new StateGraph(MessagesAnnotation);
```
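The built-in `MessagesAnnotation` state uses an append-style reducer: updates returned by nodes are concatenated onto the existing message list rather than replacing it. A minimal plain-TypeScript sketch of that semantics (hypothetical names, not the LangGraph implementation):

```typescript
// Append-style reducer: a node returns a partial state update whose
// messages are concatenated onto the existing list, never overwriting it.
type Message = { role: "human" | "ai" | "tool"; content: string };
type State = { messages: Message[] };

function applyUpdate(state: State, update: State): State {
  return { messages: [...state.messages, ...update.messages] };
}

// Two node updates accumulate into a two-message history.
let state: State = { messages: [] };
state = applyUpdate(state, {
  messages: [{ role: "human", content: "What is task decomposition?" }],
});
state = applyUpdate(state, {
  messages: [{ role: "ai", content: "Breaking a task into smaller steps." }],
});
```

This is why, in the nodes below, returning `{ messages: [response] }` appends to the history instead of discarding earlier turns.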

Leveraging tool calling for the retrieval step has another advantage: the retrieval query is generated by the model itself. This matters in a conversational setting, where a user's query may need to be contextualized against the chat history. For example, consider the following exchange:

Human: "What is task decomposition?"

AI: "Task decomposition involves breaking a complex task down into smaller, simpler steps to make it more manageable for an agent or model."

Human: "What are common ways of doing it?"

In this scenario, the model can generate a query such as "common approaches to task decomposition". Tool calling supports this naturally. As covered in the query analysis section of the RAG tutorial, this lets the model rewrite user queries into more effective search queries. It also supports direct responses that skip the retrieval step entirely (for example, replying to a generic greeting from the user).

Let's now wrap the retrieval step in a tool:

```typescript
import { z } from "zod";
import { tool } from "@langchain/core/tools";

const retrieveSchema = z.object({ query: z.string() });

const retrieve = tool(
  async ({ query }) => {
    const retrievedDocs = await vectorStore.similaritySearch(query, 2);
    // Separate documents with a blank line
    const serialized = retrievedDocs
      .map((doc) => `Source: ${doc.metadata.source}\nContent: ${doc.pageContent}`)
      .join("\n\n");
    return [
      serialized,     // formatted, human-readable content
      retrievedDocs,  // raw document objects
    ];
  },
  {
    name: "retrieve",
    description: "Retrieve information related to a query.",
    schema: retrieveSchema,
    responseFormat: "content_and_artifact",  // return both content and raw artifact
  }
);
```

See this guide for detailed instructions on creating tools.

Our graph will consist of three nodes:

    A node that takes the user input and either generates a query for the retriever or responds directly;
    A node for the retrieval tool that executes the retrieval step;
    A node that generates the final response using the retrieved context.

Let's build them below. Note that we use another prebuilt LangGraph component, ToolNode, which executes the tool and adds the result to the state as a ToolMessage.

```typescript
import {
  AIMessage,
  HumanMessage,
  SystemMessage,
  ToolMessage,
} from "@langchain/core/messages";
import { MessagesAnnotation } from "@langchain/langgraph";
import { ToolNode } from "@langchain/langgraph/prebuilt";

// Step 1: generate an AIMessage that may include a tool call
async function queryOrRespond(state: typeof MessagesAnnotation.State) {
  const llmWithTools = llm.bindTools([retrieve]);
  const response = await llmWithTools.invoke(state.messages);
  // MessagesState appends messages to state instead of overwriting
  return { messages: [response] };
}

// Step 2: execute the retrieval
const tools = new ToolNode([retrieve]);

// Step 3: generate a response using the retrieved content
async function generate(state: typeof MessagesAnnotation.State) {
  // Collect the generated ToolMessages
  let recentToolMessages = [];
  for (let i = state["messages"].length - 1; i >= 0; i--) {
    let message = state["messages"][i];
    if (message instanceof ToolMessage) {
      recentToolMessages.push(message);
    } else {
      break;
    }
  }
  let toolMessages = recentToolMessages.reverse();

  // Format into a prompt
  const docsContent = toolMessages.map((doc) => doc.content).join("\n");
  const systemMessageContent =
    "You are an assistant for question-answering tasks. " +
    "Use the following pieces of retrieved context to answer the question. " +
    "If you don't know the answer, say that you don't know. " +
    "Use three sentences maximum and keep the answer concise." +
    "\n\n" +
    `${docsContent}`;
  const conversationMessages = state.messages.filter(
    (message) =>
      message instanceof HumanMessage ||
      message instanceof SystemMessage ||
      (message instanceof AIMessage && (message.tool_calls?.length ?? 0) === 0)
  );
  const prompt = [
    new SystemMessage(systemMessageContent),
    ...conversationMessages,
  ];

  // Run
  const response = await llm.invoke(prompt);
  return { messages: [response] };
}
```
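The backwards scan at the start of `generate` can be isolated as a small helper. This plain-TypeScript sketch (hypothetical `Msg` type, not LangChain's message classes) shows why the collected messages are reversed: they are gathered newest-first.

```typescript
// Collect the trailing run of tool messages from the end of the history,
// then restore their original (oldest-first) order.
type Msg = { type: "human" | "ai" | "tool"; content: string };

function trailingToolMessages(messages: Msg[]): Msg[] {
  const recent: Msg[] = [];
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].type === "tool") recent.push(messages[i]);
    else break; // stop at the first non-tool message
  }
  return recent.reverse();
}

const history: Msg[] = [
  { type: "human", content: "What is task decomposition?" },
  { type: "ai", content: "" }, // tool-calling AI message
  { type: "tool", content: "chunk A" },
  { type: "tool", content: "chunk B" },
];
const toolMsgs = trailingToolMessages(history);
```

Only the most recent run of retrievals is formatted into the prompt; earlier tool results from previous turns are excluded by the `break`.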

Finally, we compile our application into a single graph object. In this case, we simply connect the steps into a sequence. We also allow the first queryOrRespond step to "short-circuit" and respond directly to the user if it does not generate a tool call. This lets our application support conversational experiences, such as responding to generic greetings that don't require a retrieval step.

```typescript
import { StateGraph } from "@langchain/langgraph";
import { toolsCondition } from "@langchain/langgraph/prebuilt";

const graphBuilder = new StateGraph(MessagesAnnotation)
  .addNode("queryOrRespond", queryOrRespond)
  .addNode("tools", tools)
  .addNode("generate", generate)
  .addEdge("__start__", "queryOrRespond")
  .addConditionalEdges("queryOrRespond", toolsCondition, {
    __end__: "__end__",
    tools: "tools",
  })
  .addEdge("tools", "generate")
  .addEdge("generate", "__end__");

const graph = graphBuilder.compile();
```
```typescript
import * as tslab from "tslab";

const image = await graph.getGraph().drawMermaidPng();
const arrayBuffer = await image.arrayBuffer();
await tslab.display.png(new Uint8Array(arrayBuffer));
```
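The conditional edge routes on whether the last AI message carries tool calls. As a plain-TypeScript sketch (hypothetical types, not the prebuilt `toolsCondition` implementation), the decision looks like this:

```typescript
// Tool calls present → run the "tools" node; none → the model answered
// directly, so the run short-circuits to the end.
type AiMsg = { toolCalls: { name: string; args: unknown }[] };

function route(last: AiMsg): "tools" | "__end__" {
  return last.toolCalls.length > 0 ? "tools" : "__end__";
}
```

This is the mechanism behind the short-circuit: a greeting produces an AI message with no tool calls, so the graph ends without ever touching the retriever.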

Let's test it out.

```typescript
import { BaseMessage, isAIMessage } from "@langchain/core/messages";

const prettyPrint = (message: BaseMessage) => {
  let txt = `[${message._getType()}]: ${message.content}`;
  if (isAIMessage(message) && (message.tool_calls?.length ?? 0) > 0) {
    const tool_calls = (message as AIMessage).tool_calls
      ?.map((tc) => `- ${tc.name}(${JSON.stringify(tc.args)})`)
      .join("\n");
    txt += ` \nTools: \n${tool_calls}`;
  }
  console.log(txt);
};
```
```typescript
let inputs1 = { messages: [{ role: "user", content: "Hello" }] };

for await (const step of await graph.stream(inputs1, {
  streamMode: "values",
})) {
  const lastMessage = step.messages[step.messages.length - 1];
  prettyPrint(lastMessage);
  console.log("-----\n");
}
```
```
[human]: Hello
-----
[ai]: Hello! How can I assist you today?
-----
```

When executing a search, we can stream the individual steps to observe the query generation, retrieval, and answer generation:

```typescript
let inputs2 = {
  messages: [
    {
      role: "user",
      content: "Retrieve information related to the query: what is task decomposition?",
    },
  ],
};

for await (const step of await graph.stream(inputs2, {
  streamMode: "values",
})) {
  const lastMessage = step.messages[step.messages.length - 1];
  prettyPrint(lastMessage);
  console.log("-----\n");
}
```
```
[human]: Retrieve information related to the query: what is task decomposition?
-----
[ai]: 
Tools: 
- retrieve({"query":"task decomposition"})
-----
[tool]: Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process. Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote. Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs. Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain

Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: with human inputs. Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains. Self-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It plays a crucial role in real-world tasks where trial and error are inevitable. ReAct (Yao et al. 2023)
-----
[ai]: Task decomposition is the process of breaking a complex task down into smaller, simpler steps so that it becomes easier to manage. It can be done, for example, via LLM prompting (such as "Steps for XYZ.\n1."), task-specific instructions, or human input. Methods such as Tree of Thoughts additionally explore multiple reasoning possibilities at each step.
-----
```

See the LangSmith trace: smith.langchain.com/public/22d1…

Stateful management of chat history

Note

This section of the tutorial previously used the RunnableWithMessageHistory abstraction. You can access that version of the documentation in the v0.2 docs.

As of LangChain v0.3, we recommend that LangChain users take advantage of LangGraph persistence to incorporate memory into new LangChain applications.

If your code already relies on RunnableWithMessageHistory or BaseChatMessageHistory, you do not need to make any changes. We have no plans to deprecate this functionality in the near future, as it works well for simple chat applications, and any code that uses RunnableWithMessageHistory will continue to work as expected.

For more details, see How to migrate to LangGraph Memory.

In production, question-answering applications typically persist the chat history to a database and are able to read and update it appropriately.

LangGraph implements a built-in persistence layer, making it ideal for chat applications that support multiple conversational turns.

To manage multiple conversational turns and threads, all we have to do is specify a checkpointer when compiling the application. Because the nodes in our graph append messages to the state, we will retain a consistent chat history across invocations.

LangGraph comes with a simple in-memory checkpointer, which we use below. See its documentation for more detail, including how to use different persistence backends (e.g., SQLite or Postgres).

For a detailed walkthrough of how to manage message history, head to the How to add message history (memory) guide.

```typescript
import { MemorySaver } from "@langchain/langgraph";

const checkpointer = new MemorySaver();
const graphWithMemory = graphBuilder.compile({ checkpointer });

// Specify an ID for the thread
const threadConfig = {
  configurable: { thread_id: "abc123" },
  streamMode: "values" as const,
};
```
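Conceptually, what the checkpointer buys us can be sketched in plain TypeScript (hypothetical names; the real implementation persists full graph state, not strings): saved state is keyed by thread ID, so each invocation of the same thread resumes with the accumulated history.

```typescript
// In-memory "checkpointer": thread_id → saved message history.
const checkpoints = new Map<string, string[]>();

function invokeWithMemory(threadId: string, userMessage: string): string[] {
  const messages = checkpoints.get(threadId) ?? [];
  messages.push(`human: ${userMessage}`);
  messages.push("ai: (model response)"); // the graph appends its reply
  checkpoints.set(threadId, messages);
  return messages;
}

// Same thread accumulates history; a new thread starts fresh.
invokeWithMemory("abc123", "What is task decomposition?");
const second = invokeWithMemory("abc123", "What are common ways of doing it?");
const other = invokeWithMemory("xyz789", "Hello");
```

Swapping the Map for a database table is, at this level of abstraction, what the SQLite and Postgres backends do.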

We can now invoke it as before:

```typescript
let inputs3 = {
  messages: [{ role: "user", content: "Can you look up some common ways of doing it?" }],
};

for await (const step of await graphWithMemory.stream(inputs3, threadConfig)) {
  const lastMessage = step.messages[step.messages.length - 1];
  prettyPrint(lastMessage);
  console.log("-----\n");
}
```
```
[human]: Can you look up some common ways of doing it?
-----
[ai]: 
Tools: 
- retrieve({"query":"common ways of doing it"})
-----
[tool]: Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: be provided by other developers (as in Plugins) or self-defined (as in function calls). HuggingGPT (Shen et al. 2023) is a framework to use ChatGPT as the task planner to select models available in HuggingFace platform according to the model descriptions and summarize the response based on the execution results. The system comprises of 4 stages: (1) Task planning: LLM works as the brain and parses the user requests into multiple tasks. There are four attributes associated with each task: task type, ID, dependencies, and arguments. They use few-shot examples to guide LLM to do task parsing and planning. Instruction: (2) Model selection: LLM distributes the tasks to expert models, where the request is framed as a multiple-choice question. LLM is presented with a list of models to choose from. Due to the limited context length, task type based filtration is needed. Instruction: (3) Task execution: Expert models execute on the specific tasks and log results. Instruction: (4) Response generation:

Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process. Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote. Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs. Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain
-----
[ai]: Based on the retrieved context, here are common practices for task decomposition and AI agent design:

1. **Task decomposition**:
   - **Chain-of-Thought (CoT)**: break a complex task into smaller steps (e.g. "Steps for XYZ. 1.").
   - **Tree of Thoughts**: generate multiple branching reasoning paths and search for the best one via BFS/DFS (Yao et al. 2023).

2. **AI agent frameworks**:
   - **HuggingGPT**: use an LLM (such as ChatGPT) to plan tasks, select expert models, and integrate the results (Shen et al. 2023).

3. **External tool integration**:
   - **LLM+P**: combine a classical planner for long-horizon tasks (Liu et al. 2023).

If you have a more specific domain in mind (such as programming or writing), let me know.
-----
```

Note that the query generated by the model in the second question incorporates the conversational context.

The LangSmith trace is particularly informative here, because we can see exactly what messages are visible to the chat model at each step.

Agents

Agents leverage the reasoning capabilities of LLMs to make decisions during execution. Using an agent gives the retrieval process additional discretion. Although their behavior is less predictable than the chain above, agents can execute multiple retrieval steps in service of a single query, or iterate on a single search.

Below we assemble a minimal RAG agent. Using LangGraph's prebuilt ReAct agent constructor, we can do this in a single line.

For a more advanced formulation, see LangGraph's Agentic RAG tutorial.

```typescript
import { createReactAgent } from "@langchain/langgraph/prebuilt";

const agent = createReactAgent({ llm: llm, tools: [retrieve] });
```

Let's inspect the graph:

The key difference from our earlier implementation is that instead of a final generation step ending the run, here the tool invocation loops back to the original LLM call. The model can then either answer the question using the retrieved context, or generate another tool call to gather more information.
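The loop described above can be sketched in plain TypeScript with a stand-in model function (everything here is hypothetical, not the `createReactAgent` implementation): the model is called repeatedly, tool calls are executed and fed back, and the run ends when the model replies without requesting a tool.

```typescript
type ToolCall = { name: string; args: { query: string } };
type ModelReply = { content: string; toolCalls: ToolCall[] };

// Stand-in "model": requests one retrieval, then answers from context.
function fakeModel(history: string[]): ModelReply {
  const hasContext = history.some((m) => m.startsWith("tool:"));
  if (!hasContext) {
    return {
      content: "",
      toolCalls: [{ name: "retrieve", args: { query: "task decomposition" } }],
    };
  }
  return { content: "Task decomposition splits a task into steps.", toolCalls: [] };
}

function runAgent(question: string): string {
  const history = [`human: ${question}`];
  while (true) {
    const reply = fakeModel(history);
    if (reply.toolCalls.length === 0) return reply.content; // final answer
    for (const tc of reply.toolCalls) {
      // Execute the tool and loop its result back to the model.
      history.push(`tool: (context for "${tc.args.query}")`);
    }
  }
}
```

A real model may take this loop several times for a single question, which is exactly what the multi-step example below demonstrates.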

Let's put this to the test. We construct a question that would typically require an iterative sequence of retrieval steps to answer:

```typescript
let inputMessage = `What is the standard method for task decomposition? Once you get the answer, look up common extensions of that method.`;

let inputs5 = { messages: [{ role: "user", content: inputMessage }] };

for await (const step of await agent.stream(inputs5, {
  streamMode: "values",
})) {
  const lastMessage = step.messages[step.messages.length - 1];
  prettyPrint(lastMessage);
  console.log("-----\n");
}
```
```
[human]: What is the standard method for task decomposition? Once you get the answer, look up common extensions of that method.
-----
[ai]: 
Tools: 
- retrieve({"query":"standard method for task decomposition"})
-----
[tool]: Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process. Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote. Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs. Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain

Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: with human inputs. Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains. Self-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It plays a crucial role in real-world tasks where trial and error are inevitable. ReAct (Yao et al. 2023)
-----
[ai]: Standard methods for task decomposition include:

1. **Chain of Thought (CoT)**: decompose a complex task into multiple smaller, simpler steps, revealing the model's thinking process through step-by-step reasoning.
2. **Tree of Thoughts (ToT)**: extends CoT by exploring multiple possible reasoning paths at each step, generating several thoughts per step to form a tree structure, searched via BFS or DFS with each state evaluated by a classifier or majority vote.
3. **Concrete decomposition techniques**: simple LLM prompting (e.g. "Steps for XYZ" or "What are the subgoals for achieving XYZ?"), task-specific instructions (e.g. "Write a story outline"), or human input.
4. **LLM+P**: rely on an external classical planner for long-horizon planning, using the Planning Domain Definition Language (PDDL) as an intermediate interface: translate the problem into "Problem PDDL", ask the planner to generate a PDDL plan, then translate the plan back into natural language.
5. **Self-reflection**: iteratively improve past action decisions and correct mistakes.
6. **ReAct**: combine reasoning and acting to adjust the decomposition strategy dynamically.

Next, I will look up common extensions of these methods. 
Tools: 
- retrieve({"query":"extensions of the standard task decomposition method"})
-----
[tool]: Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: hard tasks into smaller and simpler steps. CoT transforms big tasks into multiple manageable tasks and shed lights into an interpretation of the model’s thinking process. Tree of Thoughts (Yao et al. 2023) extends CoT by exploring multiple reasoning possibilities at each step. It first decomposes the problem into multiple thought steps and generates multiple thoughts per step, creating a tree structure. The search process can be BFS (breadth-first search) or DFS (depth-first search) with each state evaluated by a classifier (via a prompt) or majority vote. Task decomposition can be done (1) by LLM with simple prompting like "Steps for XYZ.\n1.", "What are the subgoals for achieving XYZ?", (2) by using task-specific instructions; e.g. "Write a story outline." for writing a novel, or (3) with human inputs. Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain

Source: https://lilianweng.github.io/posts/2023-06-23-agent/
Content: with human inputs. Another quite distinct approach, LLM+P (Liu et al. 2023), involves relying on an external classical planner to do long-horizon planning. This approach utilizes the Planning Domain Definition Language (PDDL) as an intermediate interface to describe the planning problem. In this process, LLM (1) translates the problem into “Problem PDDL”, then (2) requests a classical planner to generate a PDDL plan based on an existing “Domain PDDL”, and finally (3) translates the PDDL plan back into natural language. Essentially, the planning step is outsourced to an external tool, assuming the availability of domain-specific PDDL and a suitable planner which is common in certain robotic setups but not in many other domains. Self-reflection is a vital aspect that allows autonomous agents to improve iteratively by refining past action decisions and correcting previous mistakes. It plays a crucial role in real-world tasks where trial and error are inevitable. ReAct (Yao et al. 2023)
-----
[ai]: ### Standard methods for task decomposition

1. **Chain of Thought (CoT)**: break a complex task into multiple smaller, simpler steps and reason through them one at a time.
2. **Concrete techniques**: simple prompting (e.g. "Steps for XYZ.\n1." or "What are the subgoals for achieving XYZ?"), task-specific instructions (e.g. "Write a story outline" for novel writing), or human input.
3. **External planning tools (LLM+P)**: use an external classical planner (e.g. PDDL-based) for long-horizon planning, translating the problem into "Problem PDDL", generating a plan, and translating it back into natural language.

### Common extensions

1. **Tree of Thoughts (ToT)**: extends CoT by exploring multiple possible reasoning paths at each step, generating multiple "thought" branches in a tree structure and evaluating each state via BFS or DFS.
2. **Self-reflection**: iteratively improve past decisions and correct mistakes, well suited to real-world tasks that require trial and error.
3. **ReAct**: combine reasoning and acting to adjust the decomposition strategy dynamically.

These extensions make task decomposition more flexible and adaptive, especially in complex or changing scenarios.
-----
```

Note that the agent:

    First generated a query to search for a standard method of task decomposition;
    Having received the answer, generated a second query to search for common extensions of that method;
    Having received all the necessary context, generated the final answer to the question.

You can see the full sequence of steps, along with latency and other metadata, in the LangSmith trace.

Next steps

We've now covered the steps for building a basic conversational question-answering application:

To explore different types of retrievers and retrieval strategies, see the retrievers section of the how-to guides.

For a detailed walkthrough of LangChain's conversation memory abstractions, visit the How to add message history (memory) guide.

To learn more about agents, check out the conceptual guide and LangGraph's agent architectures page.
