Step by Step Guide on How to Build an AI News Summarizer Using Streamlit, Groq and Tavily

Introduction

In this tutorial, we will build an advanced AI-powered news agent that can search the web for the latest news on a given topic and summarize the results. This agent follows a structured workflow:

Browsing

Writing

Reflection

Refinement

Headline Generation

To enhance usability, we will also create a simple GUI using Streamlit. Similar to previous tutorials, we will use Groq for LLM-based processing and Tavily for web browsing. You can generate free API keys from their respective websites.
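The five stages above run in a fixed order, with each stage reading the shared state and contributing new keys to it. A minimal sketch of that idea in plain Python follows; the stage functions here are hypothetical stand-ins (the real ones call Tavily and the LLM), and `run_pipeline` is an illustrative helper, not part of LangGraph.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Node = Callable[[State], State]


def run_pipeline(state: State, nodes: List[Node]) -> State:
    """Run nodes in order, merging each node's partial update into the state."""
    for node in nodes:
        update = node(state)
        state = {**state, **update}
    return state


# Hypothetical stand-ins for the five stages (no network or LLM calls).
def browse(state: State) -> State:
    return {"content": [f"article about {state['topic']}"]}


def write(state: State) -> State:
    return {"drafts": [f"summary of {c}" for c in state["content"]]}


def reflect(state: State) -> State:
    return {"critiques": ["too short" for _ in state["drafts"]]}


def refine(state: State) -> State:
    return {"refined_summaries": [d + " (revised)" for d in state["drafts"]]}


def headline(state: State) -> State:
    return {"headings": [s.upper()[:20] for s in state["refined_summaries"]]}


final = run_pipeline({"topic": "AI chips"}, [browse, write, reflect, refine, headline])
print(final["refined_summaries"])
```

Each stage returns only the keys it produces, which is the same convention the agent nodes later in this tutorial follow.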

Setting Up the Environment

We begin by setting up environment variables, installing the required libraries, and importing necessary dependencies:

Install Required Libraries

pip install langgraph==0.2.53 langgraph-checkpoint==2.0.6 langgraph-sdk==0.1.36 langchain-groq langchain-community langgraph-checkpoint-sqlite==2.0.1 tavily-python streamlit

Import Libraries and Set API Keys

import os
import sqlite3
from langgraph.graph import StateGraph
from langchain_core.messages import SystemMessage, HumanMessage
from langchain_groq import ChatGroq
from tavily import TavilyClient
from langgraph.checkpoint.sqlite import SqliteSaver
from typing import TypedDict, List
from pydantic import BaseModel
import streamlit as st

# Set API Keys
os.environ['TAVILY_API_KEY'] = "your_tavily_key"
os.environ['GROQ_API_KEY'] = "your_groq_key"

# Initialize Database for Checkpointing
sqlite_conn = sqlite3.connect("checkpoints.sqlite", check_same_thread=False)
memory = SqliteSaver(sqlite_conn)

# Initialize Model and Tavily Client
model = ChatGroq(model="Llama-3.1-8b-instant")
tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
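If you would rather not hardcode keys in the script, one option is to export them in your shell and have the app fail early when one is missing. The sketch below assumes that approach; `require_env` is a hypothetical helper name, not something provided by Groq, Tavily, or LangChain.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value
```

With this in place, the Tavily client could be created as `TavilyClient(api_key=require_env("TAVILY_API_KEY"))`, and a missing key surfaces as a readable error at startup rather than as a failed request later.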

Defining the Agent State

The agent maintains state information throughout its workflow:

Topic

Content

Critique

Refined Summaries

Headings: Headlines generated for each news article

class AgentState(TypedDict):
    topic: str
    drafts: List[str]
    content: List[str]
    critiques: List[str]
    refined_summaries: List[str]
    headings: List[str]
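A `TypedDict` only adds type hints; at runtime the state is an ordinary dictionary, and at the start of a run only the topic is present. The short sketch below illustrates why a node should read not-yet-populated keys with `.get()` and a default. `PartialState` and `total=False` are assumptions made for this illustration and are not part of the original class.

```python
from typing import List, TypedDict


class PartialState(TypedDict, total=False):
    # total=False (an assumption for this sketch) marks every key as optional
    topic: str
    content: List[str]


# At the start of a run only the topic is known.
state: PartialState = {"topic": "AI chips"}

# The state is a plain dict at runtime, so a missing key
# must be read with .get() and a default value.
content = state.get("content", [])
content.append("first article text")

# A node returns only the keys it wants to update.
update = {"content": content}

print(isinstance(state, dict), update)
```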

Defining Prompts

We define system prompts for each phase of the agent’s workflow:

BROWSING_PROMPT = """You are an AI news researcher tasked with finding the latest news articles on given topics. Generate up to 3 relevant search queries."""

WRITER_PROMPT = """You are an AI news summarizer. Write a detailed summary (1 to 2 paragraphs) based on the given content, ensuring factual correctness, clarity, and coherence."""

CRITIQUE_PROMPT = """You are a teacher reviewing draft summaries against the source content. Ensure factual correctness, identify missing or incorrect details, and suggest improvements.
----------
Content: {content}
----------"""

REFINE_PROMPT = """You are an AI news editor. Given a summary and critique, refine the summary accordingly.
-----------
Summary: {summary}"""

HEADING_GENERATION_PROMPT = """You are an AI news summarizer. Generate a short, descriptive headline for each news summary."""
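Two of the prompts contain placeholders (`{content}` and `{summary}`) that are filled in with `str.format` at call time. The sketch below shows that substitution on a shortened copy of the critique prompt; `CRITIQUE_TEMPLATE` and the sample article strings are made up for illustration.

```python
# Shortened, illustrative copy of the critique prompt.
CRITIQUE_TEMPLATE = (
    "You are a teacher reviewing draft summaries against the source content.\n"
    "----------\n"
    "Content: {content}\n"
    "----------"
)

# Made-up article snippets; the second one contains literal braces.
articles = [
    "Chipmaker X reported record revenue.",
    "Regulators opened a {new} inquiry.",
]
content = "\n\n".join(articles)

# format() parses only the template, so braces inside the
# substituted article text are inserted unchanged.
system_prompt = CRITIQUE_TEMPLATE.format(content=content)
print(system_prompt)
```

The reverse case is worth keeping in mind: literal braces written in the template itself would need to be doubled (`{{` and `}}`) before calling `format`.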

Structuring Queries and News

We use Pydantic to define the structure of the queries and the news articles, which lets us constrain the shape of the LLM's output. This matters because we want the queries returned as a list of strings, and the content extracted from the web covers multiple news articles, so the summaries are also returned as a list of strings.

from pydantic import BaseModel

class Queries(BaseModel):
    queries: List[str]

class News(BaseModel):
    news: List[str]

Implementing the AI Agents

1. Browsing Node

This node generates search queries and retrieves relevant content from the web.

def browsing_node(state: AgentState):
    # Ask the LLM for search queries in the structured Queries format
    queries = model.with_structured_output(Queries).invoke([
        SystemMessage(content=BROWSING_PROMPT),
        HumanMessage(content=state['topic'])
    ])
    content = state.get('content', [])
    # Run each query through Tavily and collect the result text
    for q in queries.queries:
        response = tavily.search(query=q, max_results=2)
        for r in response['results']:
            content.append(r['content'])
    return {"content": content}
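Because several generated queries can surface the same article, the collected content may contain duplicates. One possible refinement, sketched below, is to skip results whose URL has already been seen. It assumes each search result carries a `url` field next to `content`; `collect_unique_content` and `fake_search` are hypothetical names, and `fake_search` returns canned data instead of calling Tavily.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str], List[Dict[str, str]]]


def collect_unique_content(queries: List[str], search: SearchFn) -> List[str]:
    """Collect result text for all queries, keeping one entry per URL."""
    seen_urls = set()
    content = []
    for query in queries:
        for result in search(query):
            url = result["url"]
            if url in seen_urls:
                continue  # same article returned by an earlier query
            seen_urls.add(url)
            content.append(result["content"])
    return content


def fake_search(query: str) -> List[Dict[str, str]]:
    """Canned results standing in for a real search call."""
    shared = {"url": "https://example.com/a", "content": "Article A"}
    if query == "q1":
        return [shared, {"url": "https://example.com/b", "content": "Article B"}]
    return [shared, {"url": "https://example.com/c", "content": "Article C"}]


unique = collect_unique_content(["q1", "q2"], fake_search)
print(unique)
```

Inside the browsing node, the `search` argument could be a small wrapper that calls `tavily.search(query=q, max_results=2)` and returns its `results` list.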

2. Writing Node

Extracts news summaries from the retrieved content.

def writing_node(state: AgentState):
    # Combine all retrieved content into one input for the writer
    content = "\n\n".join(state['content'])
    news = model.with_structured_output(News).invoke([
        SystemMessage(content=WRITER_PROMPT),
        HumanMessage(content=content)
    ])
    return {"drafts": news.news}

3. Reflection Node

Critiques the generated summaries against the content.

def reflection_node(state: AgentState):
    content = "\n\n".join(state['content'])
    critiques = []
    # Critique each draft against the full source content
    for draft in state['drafts']:
        response = model.invoke([
            SystemMessage(content=CRITIQUE_PROMPT.format(content=content)),
            HumanMessage(content="draft: " + draft)
        ])
        critiques.append(response.content)
    return {"critiques": critiques}

4. Refinement Node

Improves the summaries based on critique.

def refine_node(state: AgentState):
    refined_summaries = []
    # Pair each draft with its critique and ask the model to revise it
    for summary, critique in zip(state['drafts'], state['critiques']):
        response = model.invoke([
            SystemMessage(content=REFINE_PROMPT.format(summary=summary)),
            HumanMessage(content="Critique: " + critique)
        ])
        refined_summaries.append(response.content)
    return {"refined_summaries": refined_summaries}

5. Headlines Generation Node

Generates a short headline for each news summary.

def heading_node(state: AgentState):
    headings = []
    # Generate one headline per refined summary
    for summary in state['refined_summaries']:
        response = model.invoke([
            SystemMessage(content=HEADING_GENERATION_PROMPT),
            HumanMessage(content=summary)
        ])
        headings.append(response.content)
    return {"headings": headings}

Building the UI with Streamlit

import uuid

# Define Streamlit app
st.title("News Summarization Chatbot")

# Initialize session state
if "messages" not in st.session_state:
    st.session_state["messages"] = []

# Display past messages
for message in st.session_state["messages"]:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

# Input field for user
user_input = st.chat_input("Ask about the latest news...")

if user_input:
    st.session_state["messages"].append({"role": "user", "content": user_input})
    # Show the new question right away instead of waiting for the next rerun
    with st.chat_message("user"):
        st.markdown(user_input)

    with st.chat_message("assistant"):
        loading_text = st.empty()
        loading_text.markdown("Thinking...")

        # Build the graph: browser -> writer -> reflect -> refine -> heading
        builder = StateGraph(AgentState)
        builder.add_node("browser", browsing_node)
        builder.add_node("writer", writing_node)
        builder.add_node("reflect", reflection_node)
        builder.add_node("refine", refine_node)
        builder.add_node("heading", heading_node)
        builder.set_entry_point("browser")
        builder.add_edge("browser", "writer")
        builder.add_edge("writer", "reflect")
        builder.add_edge("reflect", "refine")
        builder.add_edge("refine", "heading")
        graph = builder.compile(checkpointer=memory)

        # Use a fresh thread ID per question. Streamlit reruns the script on
        # every interaction, so a module-level counter would reset to 1 each time.
        config = {"configurable": {"thread_id": str(uuid.uuid4())}}
        for step in graph.stream({"topic": user_input}, config):
            print(step)

        final_state = graph.get_state(config).values
        refined_summaries = final_state['refined_summaries']
        headings = final_state['headings']

        # Display final response
        loading_text.empty()
        response_text = "\n\n".join([f"{h}\n{s}" for h, s in zip(headings, refined_summaries)])
        st.markdown(response_text)
        st.session_state["messages"].append({"role": "assistant", "content": response_text})
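In Markdown, a single newline does not start a new paragraph, so a headline joined to its summary with one `\n` may render as part of the same block of text. A small formatting helper is one way to make each headline stand out; `format_response` below is a hypothetical function sketched for that purpose, not part of Streamlit.

```python
from typing import List


def format_response(headings: List[str], summaries: List[str]) -> str:
    """Render each headline in bold, followed by its summary as a paragraph."""
    sections = [
        f"**{heading.strip()}**\n\n{summary.strip()}"
        for heading, summary in zip(headings, summaries)
    ]
    # A blank line between sections keeps the stories visually separate.
    return "\n\n".join(sections)


demo = format_response(
    ["Chip demand rises ", "New inquiry opened"],
    ["Sales grew this quarter.", "Regulators are reviewing the deal.\n"],
)
print(demo)
```

The result can be passed to `st.markdown` in place of the joined string built in the app code.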

Conclusion

This tutorial covered the entire process of building an AI-powered news summarization agent with a simple Streamlit UI. From here, you can experiment with it and make further improvements, such as:

A better GUI

Iterative refinement
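As a starting point for the iterative refinement idea, the critique and refine steps could be repeated until the critic has nothing left to say or a round limit is reached. The sketch below shows only that control flow, under the assumption that an empty critique means "good enough"; `refine_iteratively`, `fake_critique`, and `fake_refine` are hypothetical names, and the two fakes stand in for LLM calls.

```python
from typing import Callable


def refine_iteratively(
    draft: str,
    critique: Callable[[str], str],
    refine: Callable[[str, str], str],
    max_rounds: int = 3,
) -> str:
    """Alternate critique and refinement until no feedback remains."""
    summary = draft
    for _ in range(max_rounds):
        feedback = critique(summary)
        if not feedback:
            break  # the critic is satisfied
        summary = refine(summary, feedback)
    return summary


def fake_critique(summary: str) -> str:
    """Stand-in critic: complains while the summary has more than one sentence."""
    return "Too long, drop a sentence." if summary.count(".") > 1 else ""


def fake_refine(summary: str, feedback: str) -> str:
    """Stand-in editor: removes the last sentence."""
    sentences = [s for s in summary.split(".") if s.strip()]
    return ".".join(sentences[:-1]) + "."


result = refine_iteratively("A happened. B followed. C is expected.", fake_critique, fake_refine)
print(result)
```

The round limit matters in practice because each round costs additional LLM calls, and a critic that always finds something to change would otherwise never stop.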

Happy coding!

The post Step by Step Guide on How to Build an AI News Summarizer Using Streamlit, Groq and Tavily appeared first on MarkTechPost.
