神译局 is 36Kr's in-house translation team, covering technology, business, the workplace, and lifestyle, with a focus on introducing new technologies, new perspectives, and new trends from abroad.
Editor's note: 2025 is the inaugural year of the AI agent. This series introduces the concepts, types, principles, architectures, and development of AI agents, offering a primer for readers who want to learn more. This is the fifth installment in the series, translated from the original.
As artificial intelligence (AI) continues to evolve, understanding the difference between AI workflows and AI agents is essential to using these technologies effectively. This article examines the types of AI workflows, unpacks the concept of an agent, and focuses on the core differences between the two.
1. Agents vs. Workflows
"Agent" can be defined in several ways. Some users define agents as fully autonomous systems that operate independently over long periods, using a variety of tools to accomplish complex tasks; others use the term for more prescriptive implementations that follow predefined workflows.
Anthropic groups all of these variations under the umbrella term "agentic systems," but draws a key distinction between workflows and agents:
Workflows: systems where LLMs and tools are orchestrated through predefined code paths.
Agents: systems where the LLM dynamically directs its own processes and tool usage, maintaining control over how tasks are accomplished.
A cooking analogy:
A workflow is like cooking by strictly following a recipe, step by step.
An agent is like a chef who decides on the spot how to prepare the dish based on the available ingredients and the diners' tastes.
1.1 When to Use Agents (and When Not To)
When building LLM-based applications, start with the simplest solution and add complexity only when necessary. In some cases, that means avoiding agentic systems altogether. These systems often trade higher latency and cost for better task performance, so weigh whether that trade-off is justified.
When more complexity is warranted, workflows offer predictability and consistency for well-defined tasks, while agents are better suited to scenarios that demand flexibility and model-driven decision-making. For many applications, however, optimizing a single LLM call with retrieval and in-context examples is usually enough, as the sketch below illustrates.
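As a rough illustration of that baseline (`retriever` and `question` are hypothetical placeholders, and `llm` is the chat model configured in section 2.1 below):
# A single augmented LLM call: retrieved context plus an in-context example,
# with no multi-step orchestration.
docs = retriever.invoke(question)  # hypothetical retriever returning relevant snippets
prompt = (
    "Answer the question using the context below.\n\n"
    "Example:\nQ: What does LLM stand for?\nA: Large language model.\n\n"
    f"Context:\n{docs}\n\n"
    f"Question: {question}"
)
answer = llm.invoke(prompt)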
1.2 When and How to Use Frameworks
Many frameworks can simplify the implementation of agentic systems, for example:
LangGraph from LangChain;
Amazon Bedrock's AI Agent framework;
Rivet, a visual tool for building LLM workflows;
Vellum, a platform for building and testing complex workflows.
These frameworks handle low-level plumbing such as calling LLMs, defining and parsing tools, and chaining calls together, making it easy to get an agentic system running quickly. But they often introduce extra layers of abstraction that can obscure the underlying prompts and responses, make debugging harder, and tempt developers into unnecessary complexity.
2. Core Building Blocks
This section surveys the patterns for agentic systems commonly seen in production. It starts with the foundational building block, the augmented LLM, and progressively increases complexity, from simple compositional workflows to autonomous agents.
2.1 Environment Setup
Any chat model that supports structured output and tool calling will work. The following shows how to install the dependencies, set the API key, and test tool calling with an Anthropic model:
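A minimal install step first (the exact package set is an assumption inferred from the imports used below; run in a notebook):
%pip install -U langgraph langchain-anthropic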
import os
import getpass

from langchain_anthropic import ChatAnthropic

def _set_env(var: str):
    if not os.environ.get(var):
        os.environ[var] = getpass.getpass(f"{var}: ")

_set_env("ANTHROPIC_API_KEY")

llm = ChatAnthropic(model="claude-3-5-sonnet-latest")
2.2 The Augmented LLM
The basic building block of agentic systems is an LLM augmented with capabilities such as retrieval, tool use, and memory. Current models can actively use these capabilities: generating their own search queries, selecting appropriate tools, and deciding what information to retain.
# Schema for structured output
from pydantic import BaseModel, Field

class SearchQuery(BaseModel):
    search_query: str = Field(None, description="Query that is optimized for web search.")
    justification: str = Field(
        None, description="Why this query is relevant to the user's request."
    )

# Augment the LLM with schema for structured output
structured_llm = llm.with_structured_output(SearchQuery)

# Invoke the augmented LLM
output = structured_llm.invoke("How does Calcium CT score relate to high cholesterol?")

# Define a tool
def multiply(a: int, b: int) -> int:
    return a * b

# Augment the LLM with tools
llm_with_tools = llm.bind_tools([multiply])

# Invoke the LLM with input that triggers the tool call
msg = llm_with_tools.invoke("What is 2 times 3?")

# Get the tool call
msg.tool_calls
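Note that `msg.tool_calls` contains the structured tool call the model requested, not the computed answer; executing the call is up to the application. A minimal sketch of that step (this loop is illustrative, not part of the original):
# Execute each requested tool call ourselves and print the result
for tool_call in msg.tool_calls:
    if tool_call["name"] == "multiply":
        print(multiply(tool_call["args"]["a"], tool_call["args"]["b"]))  # -> 6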
3. Workflows
3.1 Prompt Chaining
In prompt chaining, each LLM call processes the output of the previous one.
As Anthropic's blog puts it:
Prompt chaining decomposes a task into a sequence of steps, where each LLM call processes the output of the previous one. Developers can add programmatic checks (a "gate") at intermediate steps to ensure the process stays on track.
When to use: this workflow is ideal when a task can be cleanly decomposed into fixed subtasks. The main goal is to trade a little extra latency for higher accuracy by making each individual LLM call an easier, more focused task.
Typical examples:
Generating marketing copy, then translating it into multiple languages
Writing a document outline, checking that the outline meets quality criteria, then writing the full document based on it
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, START, END
from IPython.display import Image, display

# Graph state
class State(TypedDict):
    topic: str
    joke: str
    improved_joke: str
    final_joke: str

# Nodes
def generate_joke(state: State):
    """First LLM call to generate initial joke"""
    msg = llm.invoke(f"Write a short joke about {state['topic']}")
    return {"joke": msg.content}

def check_punchline(state: State):
    """Gate function to check if the joke has a punchline"""
    # Simple check - does the joke contain "?" or "!"
    if "?" in state["joke"] or "!" in state["joke"]:
        return "Fail"
    return "Pass"

def improve_joke(state: State):
    """Second LLM call to improve the joke"""
    msg = llm.invoke(f"Make this joke funnier by adding wordplay: {state['joke']}")
    return {"improved_joke": msg.content}

def polish_joke(state: State):
    """Third LLM call for final polish"""
    msg = llm.invoke(f"Add a surprising twist to this joke: {state['improved_joke']}")
    return {"final_joke": msg.content}

# Build workflow
workflow = StateGraph(State)

# Add nodes
workflow.add_node("generate_joke", generate_joke)
workflow.add_node("improve_joke", improve_joke)
workflow.add_node("polish_joke", polish_joke)

# Add edges to connect nodes
workflow.add_edge(START, "generate_joke")
workflow.add_conditional_edges(
    "generate_joke", check_punchline, {"Fail": "improve_joke", "Pass": END}
)
workflow.add_edge("improve_joke", "polish_joke")
workflow.add_edge("polish_joke", END)

# Compile
chain = workflow.compile()

# Show workflow
display(Image(chain.get_graph().draw_mermaid_png()))

# Invoke
state = chain.invoke({"topic": "cats"})
print("Initial joke:")
print(state["joke"])
print("\n--- --- ---\n")
if "improved_joke" in state:
    print("Improved joke:")
    print(state["improved_joke"])
    print("\n--- --- ---\n")
    print("Final joke:")
    print(state["final_joke"])
else:
    print("Joke failed quality gate - no punchline detected!")
3.2 Parallelization
With parallelization, multiple LLMs work on a task simultaneously.
LLMs can work in parallel in two ways:
Sectioning: breaking a task into independent subtasks that run in parallel
Voting: running the same task multiple times to get diverse outputs
When to use:
When subtasks can be parallelized for speed
When multiple perspectives or repeated attempts are needed for more reliable results
When a complex task has multiple considerations, each handled by a dedicated LLM call, which often performs better than one call juggling everything
Typical examples include the following.
Sectioning:
Guardrails: one LLM instance handles the user query while another screens it for inappropriate content
Automated evals: separate LLM calls each evaluate a different aspect of a model's output
Voting:
Code vulnerability review: several prompts review code in parallel and flag potential issues
Content moderation: different prompts assess different risk dimensions, with a voting threshold to balance false positives and false negatives
# Graph state
class State(TypedDict):
    topic: str
    joke: str
    story: str
    poem: str
    combined_output: str

# Nodes
def call_llm_1(state: State):
    """First LLM call to generate initial joke"""
    msg = llm.invoke(f"Write a joke about {state['topic']}")
    return {"joke": msg.content}

def call_llm_2(state: State):
    """Second LLM call to generate story"""
    msg = llm.invoke(f"Write a story about {state['topic']}")
    return {"story": msg.content}

def call_llm_3(state: State):
    """Third LLM call to generate poem"""
    msg = llm.invoke(f"Write a poem about {state['topic']}")
    return {"poem": msg.content}

def aggregator(state: State):
    """Combine the joke, story, and poem into a single output"""
    combined = f"Here's a story, joke, and poem about {state['topic']}!\n\n"
    combined += f"STORY:\n{state['story']}\n\n"
    combined += f"JOKE:\n{state['joke']}\n\n"
    combined += f"POEM:\n{state['poem']}"
    return {"combined_output": combined}

# Build workflow
parallel_builder = StateGraph(State)

# Add nodes
parallel_builder.add_node("call_llm_1", call_llm_1)
parallel_builder.add_node("call_llm_2", call_llm_2)
parallel_builder.add_node("call_llm_3", call_llm_3)
parallel_builder.add_node("aggregator", aggregator)

# Add edges to connect nodes
parallel_builder.add_edge(START, "call_llm_1")
parallel_builder.add_edge(START, "call_llm_2")
parallel_builder.add_edge(START, "call_llm_3")
parallel_builder.add_edge("call_llm_1", "aggregator")
parallel_builder.add_edge("call_llm_2", "aggregator")
parallel_builder.add_edge("call_llm_3", "aggregator")
parallel_builder.add_edge("aggregator", END)
parallel_workflow = parallel_builder.compile()

# Show workflow
display(Image(parallel_workflow.get_graph().draw_mermaid_png()))

# Invoke
state = parallel_workflow.invoke({"topic": "cats"})
print(state["combined_output"])
3.3 Routing
Routing classifies an input and directs it to a follow-up task. As Anthropic's blog puts it:
Routing classifies an input and directs it to a specialized follow-up task. This workflow allows for separation of concerns and for building more specialized prompts. Without it, optimizing for one kind of input can hurt performance on others.
When to use: routing works well for complex tasks with distinct categories that are better handled separately, and where classification (whether by an LLM or a traditional classification model/algorithm) can be done accurately.
Typical examples:
Customer service triage: directing different query types (general questions, refund requests, technical support) into different downstream processes, prompts, and tools
Model tiering: routing easy/common questions to a lighter model (e.g., Claude 3.5 Haiku) and hard/unusual ones to a more capable model (e.g., Claude 3.5 Sonnet) to balance cost and speed
from typing_extensions import Literal
from langchain_core.messages import HumanMessage, SystemMessage

# Schema for structured output to use as routing logic
class Route(BaseModel):
    step: Literal["poem", "story", "joke"] = Field(
        None, description="The next step in the routing process"
    )

# Augment the LLM with schema for structured output
router = llm.with_structured_output(Route)

# State
class State(TypedDict):
    input: str
    decision: str
    output: str

# Nodes
def llm_call_1(state: State):
    """Write a story"""
    result = llm.invoke(state["input"])
    return {"output": result.content}

def llm_call_2(state: State):
    """Write a joke"""
    result = llm.invoke(state["input"])
    return {"output": result.content}

def llm_call_3(state: State):
    """Write a poem"""
    result = llm.invoke(state["input"])
    return {"output": result.content}

def llm_call_router(state: State):
    """Route the input to the appropriate node"""
    # Run the augmented LLM with structured output to serve as routing logic
    decision = router.invoke(
        [
            SystemMessage(
                content="Route the input to story, joke, or poem based on the user's request."
            ),
            HumanMessage(content=state["input"]),
        ]
    )
    return {"decision": decision.step}

# Conditional edge function to route to the appropriate node
def route_decision(state: State):
    # Return the node name you want to visit next
    if state["decision"] == "story":
        return "llm_call_1"
    elif state["decision"] == "joke":
        return "llm_call_2"
    elif state["decision"] == "poem":
        return "llm_call_3"

# Build workflow
router_builder = StateGraph(State)

# Add nodes
router_builder.add_node("llm_call_1", llm_call_1)
router_builder.add_node("llm_call_2", llm_call_2)
router_builder.add_node("llm_call_3", llm_call_3)
router_builder.add_node("llm_call_router", llm_call_router)

# Add edges to connect nodes
router_builder.add_edge(START, "llm_call_router")
router_builder.add_conditional_edges(
    "llm_call_router",
    route_decision,
    {  # Name returned by route_decision : Name of next node to visit
        "llm_call_1": "llm_call_1",
        "llm_call_2": "llm_call_2",
        "llm_call_3": "llm_call_3",
    },
)
router_builder.add_edge("llm_call_1", END)
router_builder.add_edge("llm_call_2", END)
router_builder.add_edge("llm_call_3", END)

# Compile workflow
router_workflow = router_builder.compile()

# Show the workflow
display(Image(router_workflow.get_graph().draw_mermaid_png()))

# Invoke
state = router_workflow.invoke({"input": "Write me a joke about cats"})
print(state["output"])
3.4 Orchestrator-Worker
In the orchestrator-worker pattern, an orchestrator decomposes a task and delegates the pieces to workers. As Anthropic's blog puts it:
In this workflow, a central LLM dynamically breaks down tasks, delegates the subtasks to worker LLMs, and synthesizes their results.
When to use: this workflow suits complex tasks where the subtasks cannot be predicted in advance (in a coding task, for instance, the number of files to change and the nature of each change depend on the task itself). Although the topology resembles parallelization, the key difference is flexibility: subtasks are not predefined but determined by the orchestrator based on the specific input.
Typical examples:
Coding tasks that require coordinated changes across multiple files
Search tasks that involve gathering and analyzing information from multiple sources
from typing import Annotated, List
import operator

# Schema for structured output to use in planning
class Section(BaseModel):
    name: str = Field(
        description="Name for this section of the report.",
    )
    description: str = Field(
        description="Brief overview of the main topics and concepts to be covered in this section.",
    )

class Sections(BaseModel):
    sections: List[Section] = Field(
        description="Sections of the report.",
    )

# Augment the LLM with schema for structured output
planner = llm.with_structured_output(Sections)
How LangGraph creates workers
Because the orchestrator-worker workflow is so common, LangGraph provides the Send API to support it. It lets you dynamically create worker nodes and send each one a specific input. Each worker has its own state, and all worker outputs are written to a shared state key that is accessible to the orchestrator graph. The orchestrator therefore has access to every worker's output and can synthesize them into a final result.
from langgraph.constants import Send

# Graph state
class State(TypedDict):
    topic: str  # Report topic
    sections: list[Section]  # List of report sections
    completed_sections: Annotated[
        list, operator.add
    ]  # All workers write to this key in parallel
    final_report: str  # Final report

# Worker state
class WorkerState(TypedDict):
    section: Section
    completed_sections: Annotated[list, operator.add]

# Nodes
def orchestrator(state: State):
    """Orchestrator that generates a plan for the report"""
    # Generate queries
    report_sections = planner.invoke(
        [
            SystemMessage(content="Generate a plan for the report."),
            HumanMessage(content=f"Here is the report topic: {state['topic']}"),
        ]
    )
    return {"sections": report_sections.sections}

def llm_call(state: WorkerState):
    """Worker writes a section of the report"""
    # Generate section
    section = llm.invoke(
        [
            SystemMessage(
                content="Write a report section following the provided name and description. Include no preamble for each section. Use markdown formatting."
            ),
            HumanMessage(
                content=f"Here is the section name: {state['section'].name} and description: {state['section'].description}"
            ),
        ]
    )
    # Write the updated section to completed sections
    return {"completed_sections": [section.content]}

def synthesizer(state: State):
    """Synthesize full report from sections"""
    # List of completed sections
    completed_sections = state["completed_sections"]
    # Format completed section to str to use as context for final sections
    completed_report_sections = "\n\n---\n\n".join(completed_sections)
    return {"final_report": completed_report_sections}

# Conditional edge function to create llm_call workers that each write a section of the report
def assign_workers(state: State):
    """Assign a worker to each section in the plan"""
    # Kick off section writing in parallel via Send() API
    return [Send("llm_call", {"section": s}) for s in state["sections"]]

# Build workflow
orchestrator_worker_builder = StateGraph(State)

# Add the nodes
orchestrator_worker_builder.add_node("orchestrator", orchestrator)
orchestrator_worker_builder.add_node("llm_call", llm_call)
orchestrator_worker_builder.add_node("synthesizer", synthesizer)

# Add edges to connect nodes
orchestrator_worker_builder.add_edge(START, "orchestrator")
orchestrator_worker_builder.add_conditional_edges(
    "orchestrator", assign_workers, ["llm_call"]
)
orchestrator_worker_builder.add_edge("llm_call", "synthesizer")
orchestrator_worker_builder.add_edge("synthesizer", END)

# Compile the workflow
orchestrator_worker = orchestrator_worker_builder.compile()

# Show the workflow
display(Image(orchestrator_worker.get_graph().draw_mermaid_png()))

# Invoke
state = orchestrator_worker.invoke({"topic": "Create a report on LLM scaling laws"})
from IPython.display import Markdown
Markdown(state["final_report"])
3.5 Evaluator-Optimizer
In the evaluator-optimizer pattern, one LLM generates a response while another provides evaluation and feedback in a loop.
When to use: this workflow works best when there are clear evaluation criteria and iterative refinement yields measurable value. Two signs of a good fit: 1) human feedback can demonstrably improve the LLM's responses; and 2) the LLM itself can generate such feedback. This mirrors the iterative process a human writer goes through when polishing a document.
Typical examples:
Literary translation, where the translating LLM may miss nuances that an evaluator LLM can catch and critique
Complex search tasks that require multiple rounds of searching and analysis, with the evaluator deciding whether further searches are warranted
# Graph state
class State(TypedDict):
    joke: str
    topic: str
    feedback: str
    funny_or_not: str

# Schema for structured output to use in evaluation
class Feedback(BaseModel):
    grade: Literal["funny", "not funny"] = Field(
        description="Decide if the joke is funny or not.",
    )
    feedback: str = Field(
        description="If the joke is not funny, provide feedback on how to improve it.",
    )

# Augment the LLM with schema for structured output
evaluator = llm.with_structured_output(Feedback)

# Nodes
def llm_call_generator(state: State):
    """LLM generates a joke"""
    if state.get("feedback"):
        msg = llm.invoke(
            f"Write a joke about {state['topic']} but take into account the feedback: {state['feedback']}"
        )
    else:
        msg = llm.invoke(f"Write a joke about {state['topic']}")
    return {"joke": msg.content}

def llm_call_evaluator(state: State):
    """LLM evaluates the joke"""
    grade = evaluator.invoke(f"Grade the joke {state['joke']}")
    return {"funny_or_not": grade.grade, "feedback": grade.feedback}

# Conditional edge function to route back to joke generator or end based upon feedback from the evaluator
def route_joke(state: State):
    """Route back to joke generator or end based upon feedback from the evaluator"""
    if state["funny_or_not"] == "funny":
        return "Accepted"
    elif state["funny_or_not"] == "not funny":
        return "Rejected + Feedback"

# Build workflow
optimizer_builder = StateGraph(State)

# Add the nodes
optimizer_builder.add_node("llm_call_generator", llm_call_generator)
optimizer_builder.add_node("llm_call_evaluator", llm_call_evaluator)

# Add edges to connect nodes
optimizer_builder.add_edge(START, "llm_call_generator")
optimizer_builder.add_edge("llm_call_generator", "llm_call_evaluator")
optimizer_builder.add_conditional_edges(
    "llm_call_evaluator",
    route_joke,
    {  # Name returned by route_joke : Name of next node to visit
        "Accepted": END,
        "Rejected + Feedback": "llm_call_generator",
    },
)

# Compile the workflow
optimizer_workflow = optimizer_builder.compile()

# Show the workflow
display(Image(optimizer_workflow.get_graph().draw_mermaid_png()))

# Invoke
state = optimizer_workflow.invoke({"topic": "Cats"})
print(state["joke"])
4. Agents
Agents are typically implemented as an LLM that performs actions (via tool calls) in a loop, guided by environmental feedback. As Anthropic's blog puts it:
Agents can handle complex tasks, but their implementation is often straightforward: essentially an LLM using tools in a loop based on environmental feedback. This makes the clear design and documentation of the toolset critical.
When to use:
Open-ended problems where the number of steps cannot be predicted and a fixed path cannot be hardcoded
Tasks where the LLM may need many turns of interaction, requiring some trust in its decision-making
Their autonomy makes agents ideal for scaling tasks in trusted environments
Caveats:
Autonomy brings higher costs and the risk of compounding errors
Extensive testing in sandboxed environments, along with appropriate guardrails, is recommended
Typical examples (from real implementations):
A coding agent for solving SWE-bench tasks, which involve edits to many files based on a task description
Anthropic's "computer use" reference implementation, in which an agent operates a computer to accomplish tasks
from langchain_core.tools import tool

# Define tools
@tool
def multiply(a: int, b: int) -> int:
    """Multiply a and b.

    Args:
        a: first int
        b: second int
    """
    return a * b

@tool
def add(a: int, b: int) -> int:
    """Adds a and b.

    Args:
        a: first int
        b: second int
    """
    return a + b

@tool
def divide(a: int, b: int) -> float:
    """Divide a and b.

    Args:
        a: first int
        b: second int
    """
    return a / b

# Augment the LLM with tools
tools = [add, multiply, divide]
tools_by_name = {tool.name: tool for tool in tools}
llm_with_tools = llm.bind_tools(tools)
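The tool definitions above stop short of the loop itself. Below is a minimal sketch of that loop in the same LangGraph style as the workflows above; the node names (`llm_call`, `tool_node`), the system prompt, and the sample query are illustrative assumptions, not from the original:
from langgraph.graph import MessagesState
from langchain_core.messages import ToolMessage

def llm_call(state: MessagesState):
    """LLM decides whether to call a tool or answer directly"""
    return {
        "messages": [
            llm_with_tools.invoke(
                [SystemMessage(content="You are a helpful arithmetic assistant.")]
                + state["messages"]
            )
        ]
    }

def tool_node(state: MessagesState):
    """Execute the tool calls requested in the last message"""
    results = []
    for tool_call in state["messages"][-1].tool_calls:
        observation = tools_by_name[tool_call["name"]].invoke(tool_call["args"])
        results.append(ToolMessage(content=str(observation), tool_call_id=tool_call["id"]))
    return {"messages": results}

def should_continue(state: MessagesState):
    """Loop back through the tools whenever the LLM requested one"""
    if state["messages"][-1].tool_calls:
        return "tool_node"
    return END

# Build the agent: LLM node and tool node connected in a feedback loop
agent_builder = StateGraph(MessagesState)
agent_builder.add_node("llm_call", llm_call)
agent_builder.add_node("tool_node", tool_node)
agent_builder.add_edge(START, "llm_call")
agent_builder.add_conditional_edges(
    "llm_call", should_continue, {"tool_node": "tool_node", END: END}
)
agent_builder.add_edge("tool_node", "llm_call")
agent = agent_builder.compile()

# Invoke
messages = agent.invoke(
    {"messages": [HumanMessage(content="Add 3 and 4, then multiply the result by 2.")]}
)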
5. Combining and Customizing Patterns
AI agents and agentic workflows are complementary and can be combined in complex scenarios for the best results:
Enhanced automation: agents autonomously handle specialized tasks while workflows orchestrate them into efficient pipelines
Scalability: structured workflows coordinate multiple agents, helping enterprises scale operations efficiently
Resilience and adaptability: agents respond to local changes while workflows dynamically adjust the overall process
A manufacturing integration example:
In a smart manufacturing system:
Agents: monitor equipment performance, predict maintenance needs, and optimize production scheduling
Workflows: coordinate raw material procurement, production sequencing, quality inspection, and logistics to keep the end-to-end process seamless
6. Conclusion
Success in the LLM space is not about building the most sophisticated system, but about building the right system for your needs:
Start with simple prompts and optimize them through comprehensive evaluation
Add multi-step agentic systems only when simpler solutions fall short
Core principles:
Keep the design simple
Make planning steps transparent
Refine the agent-computer interface through thorough documentation and testing
On frameworks:
Frameworks can speed up development, but for production, consider reducing the layers of abstraction and building on basic components. Following these principles yields agent systems that are not only powerful but also reliable, maintainable, and trusted by their users.
Translator: boxi.