MarkTechPost@AI · February 14
Step by Step Guide on How to Build an AI News Summarizer Using Streamlit, Groq and Tavily

 

This article walks through building an AI-powered news summarization agent with Streamlit, Groq, and Tavily. The agent automatically searches for and summarizes the latest news on a given topic through a sequence of steps: browsing the web, writing summaries, reflecting on and refining them, and generating headlines. The tutorial covers environment setup, state definition, prompt design, implementation of the AI agent, and construction of the Streamlit user interface, giving developers a practical, end-to-end guide to building an AI news summarizer, along with directions for further improvement.

🌐**Browsing node**: Generates relevant search queries and uses the Tavily API to fetch the latest news content from the web, providing the raw material for summarization.

✍️**Writing node**: Takes the content gathered by the browsing node and uses the AI model to produce detailed news summaries, aiming for factual accuracy, clarity, and coherence.

🧐**Reflection node**: Evaluates each generated summary against the original source content, checking for factual errors, omissions, or inaccurate details, and suggesting improvements.

✨**Refinement node**: Incorporates the reflection node's critique to revise and polish each summary, improving its quality and accuracy.

📰**Headline generation node**: Produces a short, descriptive headline for each final summary so readers can grasp the story at a glance.

Introduction

In this tutorial, we will build an advanced AI-powered news agent that can search the web for the latest news on a given topic and summarize the results. This agent follows a structured workflow:

    Browsing: Generates relevant search queries and collects information from the web.
    Writing: Extracts and compiles news summaries from the collected information.
    Reflection: Critiques the summaries by checking for factual correctness and suggests improvements.
    Refinement: Improves the summaries based on the critique.
    Headline Generation: Generates an appropriate headline for each news summary.
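Before wiring these steps up with LangGraph, it can help to see the flow as plain functions over a shared state dictionary. The sketch below is ours, not part of the tutorial: placeholder string transformations stand in for the real LLM and search calls, purely to illustrate the order in which state keys are produced.

```python
# Minimal pipeline sketch: each stage reads keys written by earlier
# stages and adds its own. The string logic is a placeholder only.

def browse(state: dict) -> dict:
    state["content"] = [f"raw article about {state['topic']}"]
    return state

def write(state: dict) -> dict:
    state["drafts"] = [f"summary of: {c}" for c in state["content"]]
    return state

def reflect(state: dict) -> dict:
    state["critiques"] = [f"critique of: {d}" for d in state["drafts"]]
    return state

def refine(state: dict) -> dict:
    state["refined_summaries"] = [d + " (revised)" for d in state["drafts"]]
    return state

def headline(state: dict) -> dict:
    state["headings"] = [s.split(":")[0] for s in state["refined_summaries"]]
    return state

state = {"topic": "AI chips"}
for stage in (browse, write, reflect, refine, headline):
    state = stage(state)

print(state["headings"])
```

Each stage only depends on keys produced upstream, which is exactly the dependency order the LangGraph edges encode later in the tutorial.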

To enhance usability, we will also create a simple GUI using Streamlit. Similar to previous tutorials, we will use Groq for LLM-based processing and Tavily for web browsing. You can generate free API keys from their respective websites.

Setting Up the Environment

We begin by setting up environment variables, installing the required libraries, and importing necessary dependencies:

Install Required Libraries

pip install langgraph==0.2.53 langgraph-checkpoint==2.0.6 langgraph-sdk==0.1.36 langchain-groq langchain-community langgraph-checkpoint-sqlite==2.0.1 tavily-python streamlit

Import Libraries and Set API Keys

import os
import sqlite3
from langgraph.graph import StateGraph
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_groq import ChatGroq
from tavily import TavilyClient
from langgraph.checkpoint.sqlite import SqliteSaver
from typing import TypedDict, List
from pydantic import BaseModel
import streamlit as st

# Set API Keys
os.environ['TAVILY_API_KEY'] = "your_tavily_key"
os.environ['GROQ_API_KEY'] = "your_groq_key"

# Initialize Database for Checkpointing
sqlite_conn = sqlite3.connect("checkpoints.sqlite", check_same_thread=False)
memory = SqliteSaver(sqlite_conn)

# Initialize Model and Tavily Client
model = ChatGroq(model="Llama-3.1-8b-instant")
tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

Defining the Agent State

The agent maintains state information throughout its workflow:

    Topic: The topic on which the user wants the latest news.
    Drafts: The first drafts of the news summaries.
    Content: The research content extracted from Tavily's search results.
    Critiques: The critique and recommendations generated for each draft in the reflection step.
    Refined Summaries: Updated news summaries after incorporating the suggestions from the critiques.
    Headings: The headline generated for each news article.

class AgentState(TypedDict):
    topic: str
    drafts: List[str]
    content: List[str]
    critiques: List[str]
    refined_summaries: List[str]
    headings: List[str]
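A useful detail: in LangGraph, a node does not have to return the full state; it can return just the keys it changes, and LangGraph merges that partial update into the running state. The snippet below is our own rough stand-in for that merge using a plain dict (the `total=False` is our loosening so partial dicts type-check; the tutorial's class requires all keys):

```python
from typing import List, TypedDict

class AgentState(TypedDict, total=False):
    topic: str
    drafts: List[str]
    content: List[str]
    critiques: List[str]
    refined_summaries: List[str]
    headings: List[str]

state: AgentState = {"topic": "quantum computing"}

# A node's return value updates only the keys it produced...
node_update: AgentState = {"content": ["article one", "article two"]}
state = {**state, **node_update}

# ...while previously written keys are preserved.
print(state["topic"])
print(len(state["content"]))
```

This is why each node function later in the tutorial returns a small dict such as `{"content": content}` rather than a full `AgentState`.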

Defining Prompts

We define system prompts for each phase of the agent’s workflow:

BROWSING_PROMPT = """You are an AI news researcher tasked with finding the latest news articles on given topics. Generate up to 3 relevant search queries."""

WRITER_PROMPT = """You are an AI news summarizer. Write a detailed summary (1 to 2 paragraphs) based on the given content, ensuring factual correctness, clarity, and coherence."""

CRITIQUE_PROMPT = """You are a teacher reviewing draft summaries against the source content. Ensure factual correctness, identify missing or incorrect details, and suggest improvements.
----------
Content: {content}
----------"""

REFINE_PROMPT = """You are an AI news editor. Given a summary and critique, refine the summary accordingly.
-----------
Summary: {summary}"""

HEADING_GENERATION_PROMPT = """You are an AI news summarizer. Generate a short, descriptive headline for each news summary."""
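Note that CRITIQUE_PROMPT and REFINE_PROMPT are format strings: the retrieved content or the draft summary is injected with `str.format` before the message is sent to the model. A quick illustration with a shortened version of the critique prompt (the sample content string is invented for the example):

```python
CRITIQUE_PROMPT = """You are a teacher reviewing draft summaries against the source content.
----------
Content: {content}
----------"""

# The placeholder is replaced by the text gathered during browsing.
filled = CRITIQUE_PROMPT.format(content="Acme Corp released a new chip on Monday.")
print("Acme Corp" in filled)
print("{content}" in filled)
```

Keeping the source content inside the system message this way lets the human message carry only the draft under review.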

Structuring Queries and News

We use Pydantic to define the structure of the queries and the news articles. Pydantic lets us constrain the structure of the LLM's output, which matters here because the queries must come back as a list of strings, and the content extracted from the web contains multiple news articles, so it must be a list of strings as well.

from pydantic import BaseModel

class Queries(BaseModel):
    queries: List[str]

class News(BaseModel):
    news: List[str]
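Under the hood, `with_structured_output` constrains the model to emit JSON matching the schema, which Pydantic then validates. You can exercise the validation half in isolation; this sketch assumes Pydantic v2 (where the method is `model_validate`; on v1 it was `parse_obj`), and the sample payloads are invented:

```python
from typing import List
from pydantic import BaseModel, ValidationError

class Queries(BaseModel):
    queries: List[str]

# A well-formed payload, such as one decoded from the model's JSON
# output, parses cleanly into typed fields.
parsed = Queries.model_validate({"queries": ["latest AI chip news", "GPU market trends"]})
print(parsed.queries[0])

# A malformed payload fails loudly instead of propagating bad data
# into the downstream nodes.
try:
    Queries.model_validate({"queries": "not a list"})
except ValidationError:
    print("validation failed")
```

This fail-fast behavior is the main reason to prefer structured output over parsing free-form model text by hand.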

Implementing the AI Agents

1. Browsing Node

This node generates search queries and retrieves relevant content from the web.

def browsing_node(state: AgentState):
    queries = model.with_structured_output(Queries).invoke([
        SystemMessage(content=BROWSING_PROMPT),
        HumanMessage(content=state['topic'])
    ])
    content = state.get('content', [])
    for q in queries.queries:
        response = tavily.search(query=q, max_results=2)
        for r in response['results']:
            content.append(r['content'])
    return {"content": content}

2. Writing Node

Extracts news summaries from the retrieved content.

def writing_node(state: AgentState):
    content = "\n\n".join(state['content'])
    news = model.with_structured_output(News).invoke([
        SystemMessage(content=WRITER_PROMPT),
        HumanMessage(content=content)
    ])
    return {"drafts": news.news}

3. Reflection Node

Critiques the generated summaries against the content.

def reflection_node(state: AgentState):
    content = "\n\n".join(state['content'])
    critiques = []
    for draft in state['drafts']:
        response = model.invoke([
            SystemMessage(content=CRITIQUE_PROMPT.format(content=content)),
            HumanMessage(content="draft: " + draft)
        ])
        critiques.append(response.content)
    return {"critiques": critiques}

4. Refinement Node

Improves the summaries based on critique.

def refine_node(state: AgentState):
    refined_summaries = []
    for summary, critique in zip(state['drafts'], state['critiques']):
        response = model.invoke([
            SystemMessage(content=REFINE_PROMPT.format(summary=summary)),
            HumanMessage(content="Critique: " + critique)
        ])
        refined_summaries.append(response.content)
    return {"refined_summaries": refined_summaries}

5. Headlines Generation Node

Generates a short headline for each news summary.

def heading_node(state: AgentState):
    headings = []
    for summary in state['refined_summaries']:
        response = model.invoke([
            SystemMessage(content=HEADING_GENERATION_PROMPT),
            HumanMessage(content=summary)
        ])
        headings.append(response.content)
    return {"headings": headings}

Building the UI with Streamlit

# Define Streamlit app
st.title("News Summarization Chatbot")

# Initialize session state
if "messages" not in st.session_state:
    st.session_state["messages"] = []

# Persist the thread counter across Streamlit reruns so each query
# runs in a fresh checkpoint thread rather than resuming an old one
if "thread_id" not in st.session_state:
    st.session_state["thread_id"] = 1

# Display past messages
for message in st.session_state["messages"]:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Input field for user
user_input = st.chat_input("Ask about the latest news...")

if user_input:
    st.session_state["messages"].append({"role": "user", "content": user_input})
    with st.chat_message("assistant"):
        loading_text = st.empty()
        loading_text.markdown("Thinking...")

        # Build the agent graph
        builder = StateGraph(AgentState)
        builder.add_node("browser", browsing_node)
        builder.add_node("writer", writing_node)
        builder.add_node("reflect", reflection_node)
        builder.add_node("refine", refine_node)
        builder.add_node("heading", heading_node)
        builder.set_entry_point("browser")
        builder.add_edge("browser", "writer")
        builder.add_edge("writer", "reflect")
        builder.add_edge("reflect", "refine")
        builder.add_edge("refine", "heading")
        graph = builder.compile(checkpointer=memory)

        # Run the graph for this conversation thread
        config = {"configurable": {"thread_id": f"{st.session_state['thread_id']}"}}
        for step in graph.stream({"topic": user_input}, config):
            print(step)

        final_state = graph.get_state(config).values
        refined_summaries = final_state['refined_summaries']
        headings = final_state['headings']
        st.session_state["thread_id"] += 1

        # Display final response
        loading_text.empty()
        response_text = "\n\n".join(
            f"{h}\n{summary}" for h, summary in zip(headings, refined_summaries)
        )
        st.markdown(response_text)
        st.session_state["messages"].append({"role": "assistant", "content": response_text})
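To try the app, save all of the code above into a single file and launch it with Streamlit's CLI (the filename app.py is our choice for the example, not mandated by the tutorial):

```shell
streamlit run app.py
```

Streamlit serves the chatbot locally, by default at http://localhost:8501.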

Conclusion

This tutorial covered the entire process of building an AI-powered news summarization agent with a simple Streamlit UI. From here, you can experiment with the agent and make further improvements of your own.

Happy coding!





