MarkTechPost@AI · 2 days ago at 05:26
LangGraph Tutorial: A Step-by-Step Guide to Creating a Text Analysis Pipeline

This article gives a detailed introduction to LangGraph, a powerful framework from LangChain that lets developers build stateful, multi-agent LLM applications as graph structures. Through a concrete text analysis pipeline, it shows how to use LangGraph to implement text classification, entity extraction, and text summarization. It also demonstrates how to extend the agent by adding a sentiment analysis node and introducing conditional edges, so the processing flow adapts dynamically to the text's content, improving efficiency and reducing cost. LangGraph offers an intuitive and effective way to build complex, modular, and extensible AI systems.

✨ LangGraph is a graph-based framework for creating stateful, multi-agent LLM applications, providing the core tools for designing how an AI agent thinks and acts.

🚀 The framework supports key features such as state management, flexible routing, persistence, and visualization, enabling modular, extensible natural language processing workflows like the text analysis pipeline shown in this article.

🧠 Through an example combining text classification, entity extraction, and text summarization, the article vividly shows how LangGraph chains multiple LLM capabilities together to achieve human-like text understanding, with each step's output providing context for the next.

💡 Conditional edges are a highlight of LangGraph: they let the agent choose its execution path dynamically based on the current state (such as the classification result), for example running entity extraction only on certain text types, yielding smarter, more efficient, and more economical AI systems.

🔧 By adding a sentiment analysis node and experimenting with conditional edges, the article further demonstrates LangGraph's extensibility: developers can easily add new capabilities to an AI agent or optimize existing flows for more complex applications.

Estimated reading time: 5 minutes

Introduction to LangGraph

LangGraph is a powerful framework by LangChain designed for creating stateful, multi-actor applications with LLMs. It provides the structure and tools needed to build sophisticated AI agents through a graph-based approach.

Think of LangGraph as an architect’s drafting table – it gives us the tools to design how our agent will think and act. Just as an architect draws blueprints showing how different rooms connect and how people will flow through a building, LangGraph lets us design how different capabilities will connect and how information will flow through our agent.

Key Features:

    State management across nodes
    Flexible routing between capabilities
    Persistence of workflow state
    Visualization of the graph

In this tutorial, we’ll demonstrate LangGraph by building a multi-step text analysis pipeline that processes text through three stages:

    1. Text Classification: Categorize input text into predefined categories
    2. Entity Extraction: Identify key entities from the text
    3. Text Summarization: Generate a concise summary of the input text

This pipeline showcases how LangGraph can be used to create a modular, extensible workflow for natural language processing tasks.
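Before wiring anything into LangGraph, the flow can be sketched as plain Python functions passing a shared state dictionary. The stage bodies below are hypothetical stand-ins (not the LLM-backed nodes built later), but the pattern — each stage returns only the keys it adds, which are merged into the shared state — is exactly how the LangGraph nodes in this tutorial behave:

```python
# Dependency-free sketch of the three-stage pipeline. Each stage reads the
# shared state dict and returns a partial update with just the keys it adds.

def classify(state):
    # Stand-in for the classification LLM call
    return {"classification": "News"}

def extract_entities(state):
    # Stand-in for the entity-extraction LLM call
    return {"entities": ["OpenAI", "GPT-4"]}

def summarize(state):
    # Stand-in for the summarization LLM call
    return {"summary": state["text"][:40]}

def run_pipeline(text):
    state = {"text": text}
    for stage in (classify, extract_entities, summarize):
        state.update(stage(state))  # merge each stage's partial update
    return state

result = run_pipeline("OpenAI has announced the GPT-4 model.")
print(result["classification"])  # News
```

Because every stage sees the accumulated state, later stages can read earlier results — the same property LangGraph provides through its state graph.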

Setting Up Our Environment

Before diving into the code, let’s set up our development environment.

Installation

# Install required packages
!pip install langgraph langchain langchain-openai python-dotenv

Setting Up API Keys

We’ll need an OpenAI API key to use their models. If you haven’t already, you can get one from https://platform.openai.com/signup.
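If you keep the key in a `.env` file (loaded below via `python-dotenv`), the file is a single line; the value shown is a placeholder, not a real key:

```
OPENAI_API_KEY=sk-your-key-here
```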

Check out the Full Codes here

import os
from dotenv import load_dotenv

# Load environment variables from .env file (create this with your API key)
load_dotenv()

# Set OpenAI API key
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')

Testing Our Setup

Let’s make sure our environment is working correctly by creating a simple test with the OpenAI model:

from langchain_openai import ChatOpenAI

# Initialize the ChatOpenAI instance
llm = ChatOpenAI(model="gpt-4o-mini")

# Test the setup
response = llm.invoke("Hello! Are you working?")
print(response.content)

Building Our Text Analysis Pipeline

Now let’s import the necessary packages for our LangGraph text analysis pipeline:

import os
from typing import TypedDict, List, Annotated
from langgraph.graph import StateGraph, END
from langchain.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from langchain.schema import HumanMessage
from langchain_core.runnables.graph import MermaidDrawMethod
from IPython.display import display, Image

Designing Our Agent’s Memory

Just as human intelligence requires memory, our agent needs a way to keep track of information. We create this using a TypedDict to define our state structure:

class State(TypedDict):
    text: str
    classification: str
    entities: List[str]
    summary: str

# Initialize our language model with temperature=0 for more deterministic outputs
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

Creating Our Agent’s Core Capabilities

Now we’ll create the actual skills our agent will use. Each of these capabilities is implemented as a function that performs a specific type of analysis.

1. Classification Node

def classification_node(state: State):
    '''Classify the text into one of the categories: News, Blog, Research, or Other'''
    prompt = PromptTemplate(
        input_variables=["text"],
        template="Classify the following text into one of the categories: News, Blog, Research, or Other.\n\nText:{text}\n\nCategory:"
    )
    message = HumanMessage(content=prompt.format(text=state["text"]))
    classification = llm.invoke([message]).content.strip()
    return {"classification": classification}

2. Entity Extraction Node

def entity_extraction_node(state: State):
    '''Extract all the entities (Person, Organization, Location) from the text'''
    prompt = PromptTemplate(
        input_variables=["text"],
        template="Extract all the entities (Person, Organization, Location) from the following text. Provide the result as a comma-separated list.\n\nText:{text}\n\nEntities:"
    )
    message = HumanMessage(content=prompt.format(text=state["text"]))
    entities = llm.invoke([message]).content.strip().split(", ")
    return {"entities": entities}

3. Summarization Node

def summarization_node(state: State):
    '''Summarize the text in one short sentence'''
    prompt = PromptTemplate(
        input_variables=["text"],
        template="Summarize the following text in one short sentence.\n\nText:{text}\n\nSummary:"
    )
    message = HumanMessage(content=prompt.format(text=state["text"]))
    summary = llm.invoke([message]).content.strip()
    return {"summary": summary}

Bringing It All Together

Now comes the most exciting part – connecting these capabilities into a coordinated system using LangGraph:

# Create our StateGraph
workflow = StateGraph(State)

# Add nodes to the graph
workflow.add_node("classification_node", classification_node)
workflow.add_node("entity_extraction", entity_extraction_node)
workflow.add_node("summarization", summarization_node)

# Add edges to the graph
workflow.set_entry_point("classification_node")  # Set the entry point of the graph
workflow.add_edge("classification_node", "entity_extraction")
workflow.add_edge("entity_extraction", "summarization")
workflow.add_edge("summarization", END)

# Compile the graph
app = workflow.compile()

Workflow Structure: Our pipeline follows this path:
classification_node → entity_extraction → summarization → END

Testing Our Agent

Now that we’ve built our agent, let’s see how it performs with a real-world text example:

sample_text = """ OpenAI has announced the GPT-4 model, which is a large multimodal model that exhibits human-level performance on various professional benchmarks. It is developed to improve the alignment and safety of AI systems. Additionally, the model is designed to be more efficient and scalable than its predecessor, GPT-3. The GPT-4 model is expected to be released in the coming months and will be available to the public for research and development purposes. """ state_input = {"text": sample_text} result = app.invoke(state_input) print("Classification:", result["classification"]) print("\nEntities:", result["entities"]) print("\nSummary:", result["summary"])Classification: News Entities: ['OpenAI', 'GPT-4', 'GPT-3'] Summary: OpenAI's upcoming GPT-4 model is a multimodal AI that aims for human-level performance and improved safety, efficiency, and scalability compared to GPT-3.

Understanding the Power of Coordinated Processing

What makes this result particularly impressive isn’t just the individual outputs – it’s how each step builds on the others to create a complete understanding of the text.

    1. The classification provides context that helps frame our understanding of the text type
    2. The entity extraction identifies important names and concepts
    3. The summarization distills the essence of the document

This mirrors human reading comprehension, where we naturally form an understanding of what kind of text it is, note important names and concepts, and form a mental summary – all while maintaining the relationships between these different aspects of understanding.

Try with Your Own Text

Now let’s try our pipeline with another text sample:

# Replace this with your own text to analyze
your_text = """
The recent advancements in quantum computing have opened new possibilities for cryptography and data security. Researchers at MIT and Google have demonstrated quantum algorithms that could potentially break current encryption methods. However, they are also developing new quantum-resistant encryption techniques to protect data in the future.
"""

# Process the text through our pipeline
your_result = app.invoke({"text": your_text})

print("Classification:", your_result["classification"])
print("\nEntities:", your_result["entities"])
print("\nSummary:", your_result["summary"])

Output:

Classification: Research

Entities: ['MIT', 'Google']

Summary: Recent advancements in quantum computing may threaten current encryption methods while also prompting the development of new quantum-resistant techniques.

Adding More Capabilities (Advanced)

One of the powerful aspects of LangGraph is how easily we can extend our agent with new capabilities. Let’s add a sentiment analysis node to our pipeline:

# First, let's update our State to include sentiment
class EnhancedState(TypedDict):
    text: str
    classification: str
    entities: List[str]
    summary: str
    sentiment: str

# Create our sentiment analysis node
def sentiment_node(state: EnhancedState):
    '''Analyze the sentiment of the text: Positive, Negative, or Neutral'''
    prompt = PromptTemplate(
        input_variables=["text"],
        template="Analyze the sentiment of the following text. Is it Positive, Negative, or Neutral?\n\nText:{text}\n\nSentiment:"
    )
    message = HumanMessage(content=prompt.format(text=state["text"]))
    sentiment = llm.invoke([message]).content.strip()
    return {"sentiment": sentiment}

# Create a new workflow with the enhanced state
enhanced_workflow = StateGraph(EnhancedState)

# Add the existing nodes
enhanced_workflow.add_node("classification_node", classification_node)
enhanced_workflow.add_node("entity_extraction", entity_extraction_node)
enhanced_workflow.add_node("summarization", summarization_node)

# Add our new sentiment node
enhanced_workflow.add_node("sentiment_analysis", sentiment_node)

# Create a more complex workflow with branches
enhanced_workflow.set_entry_point("classification_node")
enhanced_workflow.add_edge("classification_node", "entity_extraction")
enhanced_workflow.add_edge("entity_extraction", "summarization")
enhanced_workflow.add_edge("summarization", "sentiment_analysis")
enhanced_workflow.add_edge("sentiment_analysis", END)

# Compile the enhanced graph
enhanced_app = enhanced_workflow.compile()

Testing the Enhanced Agent

# Try the enhanced pipeline with the same text
enhanced_result = enhanced_app.invoke({"text": sample_text})
print("Classification:", enhanced_result["classification"])
print("\nEntities:", enhanced_result["entities"])
print("\nSummary:", enhanced_result["summary"])
print("\nSentiment:", enhanced_result["sentiment"])

Output:

Classification: News

Entities: ['OpenAI', 'GPT-4', 'GPT-3']

Summary: OpenAI's upcoming GPT-4 model is a multimodal AI that aims for human-level performance and improved safety, efficiency, and scalability compared to GPT-3.

Sentiment: The sentiment of the text is Positive. It highlights the advancements and improvements of the GPT-4 model, emphasizing its human-level performance, efficiency, scalability, and the positive implications for AI alignment and safety. The anticipation of its release for public use further contributes to the positive tone.

Adding Conditional Edges (Advanced Logic)

Why Conditional Edges?

So far, our graph has followed a fixed linear path: classification_node → entity_extraction → summarization → (sentiment)

But in real-world applications, we often want to run certain steps only if needed. For example, we might want to extract entities only when the text is classified as News or Research, and skip that step for blogs and other content.

LangGraph makes this easy through conditional edges – logic gates that dynamically route execution based on data in the current state.

Creating a Routing Function

# Route after classification
def route_after_classification(state: EnhancedState) -> bool:
    category = state["classification"].lower()  # "news", "blog", "research", or "other"
    # True -> run entity extraction; False -> skip straight to summarization
    return category in ["news", "research"]
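Since the membership test evaluates to a plain boolean, the router can be sanity-checked on its own before wiring it into the graph. This is a quick standalone check, not part of the original post; the `EnhancedState` annotation is dropped so the snippet runs without the earlier definitions:

```python
# Standalone copy of the router: True means "run entity extraction",
# False means "skip straight to summarization".
def route_after_classification(state) -> bool:
    category = state["classification"].lower()
    return category in ["news", "research"]

print(route_after_classification({"classification": "News"}))      # True
print(route_after_classification({"classification": "Blog"}))      # False
print(route_after_classification({"classification": "Research"}))  # True
```

The boolean return values are what the `path_map` in the next snippet keys on.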

Define the Conditional Graph

from langgraph.graph import StateGraph, END

conditional_workflow = StateGraph(EnhancedState)

# Add nodes
conditional_workflow.add_node("classification_node", classification_node)
conditional_workflow.add_node("entity_extraction", entity_extraction_node)
conditional_workflow.add_node("summarization", summarization_node)
conditional_workflow.add_node("sentiment_analysis", sentiment_node)

# Set entry point
conditional_workflow.set_entry_point("classification_node")

# Add conditional edge
conditional_workflow.add_conditional_edges("classification_node", route_after_classification, path_map={
    True: "entity_extraction",
    False: "summarization"
})

# Add remaining static edges
conditional_workflow.add_edge("entity_extraction", "summarization")
conditional_workflow.add_edge("summarization", "sentiment_analysis")
conditional_workflow.add_edge("sentiment_analysis", END)

# Compile
conditional_app = conditional_workflow.compile()

Testing the Conditional Pipeline

test_text = """OpenAI released the GPT-4 model with enhanced performance on academic and professional tasks. It's seen as a major breakthrough in alignment and reasoning capabilities."""result = conditional_app.invoke({"text": test_text})print("Classification:", result["classification"])print("Entities:", result.get("entities", "Skipped"))print("Summary:", result["summary"])print("Sentiment:", result["sentiment"])
Classification: NewsEntities: ['OpenAI', 'GPT-4']Summary: OpenAI's GPT-4 model significantly improves performance in academic and professional tasks, marking a breakthrough in alignment and reasoning.Sentiment: The sentiment of the text is Positive. It highlights the release of the GPT-4 model as a significant advancement, emphasizing its enhanced performance and breakthrough capabilities.

Now try it with a Blog:

blog_text = """Here's what I learned from a week of meditating in silence. No phones, no talking—just me, my breath, and some deep realizations."""result = conditional_app.invoke({"text": blog_text})print("Classification:", result["classification"])print("Entities:", result.get("entities", "Skipped (not applicable)"))print("Summary:", result["summary"])print("Sentiment:", result["sentiment"])
Classification: BlogEntities: Skipped (not applicable)Summary: A week of silent meditation led to profound personal insights.Sentiment: The sentiment of the text is Positive. The mention of "deep realizations" and the overall reflective nature of the experience suggests a beneficial and enlightening outcome from the meditation practice.

With conditional edges, our agent can now:

    Make decisions based on context
    Skip unnecessary steps
    Run faster and cheaper
    Behave more intelligently
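The "faster and cheaper" point is easy to quantify under the assumption (true of the pipeline above) that every node makes exactly one LLM call. The `calls_for` helper below is hypothetical, written just for this back-of-the-envelope comparison:

```python
# Count LLM calls per document, assuming one call per graph node.
def calls_for(category, conditional=True):
    nodes = ["classification", "entity_extraction", "summarization", "sentiment"]
    if conditional and category.lower() not in ("news", "research"):
        nodes.remove("entity_extraction")  # the conditional edge skips this node
    return len(nodes)

print(calls_for("Blog"))                     # 3 calls: entity extraction skipped
print(calls_for("News"))                     # 4 calls: full path
print(calls_for("Blog", conditional=False))  # 4 calls in the linear pipeline
```

One LLM call saved per blog-like document adds up quickly at scale, both in latency and in token cost.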

Conclusion

In this tutorial, we’ve:

    Explored LangGraph concepts and its graph-based approach
    Built a text processing pipeline with classification, entity extraction, and summarization
    Enhanced our pipeline with additional capabilities
    Introduced conditional edges to dynamically control the flow based on classification results
    Visualized our workflow
    Tested our agent with real-world text examples

LangGraph provides a powerful framework for creating AI agents by modeling them as graphs of capabilities. This approach makes it easy to design, modify, and extend complex AI systems.


The post LangGraph Tutorial: A Step-by-Step Guide to Creating a Text Analysis Pipeline appeared first on MarkTechPost.
