To Data & Beyond

Building Agents with LangGraph Course #5: Persistence & Streaming in LangGraph

Youssef Hosni
Aug 13, 2025

When you start building complex AI agents designed for more than just a single turn of conversation, you’ll quickly run into two essential concepts: persistence and streaming. These features are the bedrock of creating robust, production-ready applications that can handle long-running tasks, maintain context, and provide a transparent user experience.

Persistence is the ability to save the state of your agent at any point in time. This allows you to pause and resume conversations, giving your agent a form of memory. It’s crucial for multi-turn dialogues and for enabling more advanced features like human-in-the-loop interventions.

Streaming provides a real-time view into your agent’s operations. Instead of waiting for a final answer, you can see the intermediate steps — like which tools are being called — or even the final answer as it’s being generated, token by token.

In this hands-on tutorial, the fifth part of the LangGraph course, we'll take a basic LangGraph agent and enhance it with both of these capabilities.

Table of Contents:

  1. Setting Up the Agent

  2. Adding Persistence with Checkpointers

  3. Streaming Intermediate Steps

  4. Streaming Tokens in Real Time

This article is the fifth in the ongoing series of Building LLM Agents with LangGraph:

  • Introduction to Agents & LangGraph (Published!)

  • Building Simple ReAct Agent from Scratch (Published!)

  • Main Building Units of LangGraph (Published!)

  • Agentic Search Tools in LangGraph (Published!)

  • Persistence and Streaming in LangGraph (You are here!)

  • Human in the Loop in LLM Agents (Coming Soon!)

  • Putting it All Together! Building Essay Writer Agent (Coming Soon!)

This series is designed to take readers from foundational knowledge to advanced practices in building LLM agents with LangGraph.

Each article delves into essential components, such as constructing simple ReAct agents from scratch, leveraging LangGraph’s building units, utilizing agentic search tools, implementing persistence and streaming capabilities, integrating human-in-the-loop interactions, and culminating in the creation of a fully functional essay-writing agent.

By the end of this series, you will have a comprehensive understanding of LangGraph, practical skills to design and deploy LLM agents, and the confidence to build customized AI-driven workflows tailored to diverse applications.


1. Setting Up the Agent

First, let’s get our initial agent set up. This will be the same research assistant agent from previous examples, equipped with a search tool. We begin by loading our environment variables and making the necessary imports.

from dotenv import load_dotenv
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator
from langchain_core.messages import AnyMessage, SystemMessage, HumanMessage, ToolMessage
from langchain_openai import ChatOpenAI
from langchain_community.tools.tavily_search import TavilySearchResults

_ = load_dotenv()

Next, we’ll initialize our search tool and define the structure of our agent’s state. The state will simply contain a list of messages that we’ll append to over time.

tool = TavilySearchResults(max_results=2)

class AgentState(TypedDict):
    messages: Annotated[list[AnyMessage], operator.add]
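The Annotated[..., operator.add] reducer is what makes this state append-only: when a node returns {'messages': [...]}, LangGraph merges that partial update into the existing state using the declared reducer instead of overwriting it. A minimal sketch of the merge semantics (plain Python, no LangGraph required):

```python
import operator

# State before the node runs, and the partial update the node returns.
existing = ["msg-1"]
update = ["msg-2", "msg-3"]

# LangGraph applies the declared reducer to combine old and new values;
# for lists, operator.add concatenates, so new messages are appended
# rather than replacing the history.
merged = operator.add(existing, update)
print(merged)  # ['msg-1', 'msg-2', 'msg-3']
```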

2. Adding Persistence with Checkpointers

To add persistence, LangGraph introduces the concept of a checkpointer. A checkpointer automatically saves a snapshot of the graph’s state after every step (i.e., after each node is executed). This allows the agent to pick up right where it left off in a future interaction.

For this guide, we'll use SqliteSaver, a straightforward checkpointer backed by a SQLite database. Passing ":memory:" creates a temporary in-memory database, which is perfect for testing. For production, you can point it at a file-based database or use more robust checkpointers for Postgres or Redis. (Note that in more recent langgraph releases, SqliteSaver.from_conn_string returns a context manager intended for use in a with block; the direct assignment shown below matches earlier versions, so check the API of your installed version.)

from langgraph.checkpoint.sqlite import SqliteSaver

memory = SqliteSaver.from_conn_string(":memory:")
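To see why a connection string is all the checkpointer needs, here is the underlying SQLite behavior using only the standard library. This is an illustration of ":memory:" versus file-backed storage, not LangGraph's actual checkpoint schema:

```python
import sqlite3

# ":memory:" gives a throwaway database that disappears with the process;
# a file path (e.g. "checkpoints.db") would persist across restarts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE checkpoints (thread_id TEXT, step INTEGER, state TEXT)")
conn.execute("INSERT INTO checkpoints VALUES ('thread-1', 0, '{\"messages\": []}')")

# Reading the row back works for as long as this process (and connection) lives.
row = conn.execute("SELECT thread_id, step FROM checkpoints").fetchone()
print(row)  # ('thread-1', 0)
```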

Now, we need to modify our Agent class to accept this checkpointer when the graph is compiled. We’ll add a checkpointer parameter to the __init__ method and pass it to graph.compile().

class Agent:
    def __init__(self, model, tools, checkpointer, system=""):
        self.system = system
        graph = StateGraph(AgentState)
        graph.add_node("llm", self.call_openai)
        graph.add_node("action", self.take_action)
        graph.add_conditional_edges("llm", self.exists_action, {True: "action", False: END})
        graph.add_edge("action", "llm")
        graph.set_entry_point("llm")
        # Pass the checkpointer to the compile method
        self.graph = graph.compile(checkpointer=checkpointer)
        self.tools = {t.name: t for t in tools}
        self.model = model.bind_tools(tools)

    def call_openai(self, state: AgentState):
        messages = state['messages']
        if self.system:
            messages = [SystemMessage(content=self.system)] + messages
        message = self.model.invoke(messages)
        return {'messages': [message]}

    def exists_action(self, state: AgentState):
        result = state['messages'][-1]
        return len(result.tool_calls) > 0

    def take_action(self, state: AgentState):
        tool_calls = state['messages'][-1].tool_calls
        results = []
        for t in tool_calls:
            print(f"Calling: {t}")
            result = self.tools[t['name']].invoke(t['args'])
            results.append(ToolMessage(tool_call_id=t['id'], name=t['name'], content=str(result)))
        print("Back to the model!")
        return {'messages': results}
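The wiring above forms the ReAct loop: the llm node runs, the conditional edge calls exists_action, and control either moves to action (and then back to llm) or terminates. A plain-Python sketch of that routing decision (illustrative only, not the LangGraph API):

```python
END = "END"  # stands in for langgraph's END sentinel

def route(tool_calls):
    # Mirrors exists_action: go to the tool node only when the model
    # requested at least one tool call; otherwise finish the run.
    return "action" if len(tool_calls) > 0 else END

# A reply with a pending tool call routes to the tool node...
print(route([{"name": "tavily_search", "args": {"query": "LangGraph"}}]))  # action
# ...and a plain final answer ends the run.
print(route([]))  # END
```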

With our modified class, we can now instantiate our agent, passing in the memory object we created.

prompt = """You are a smart research assistant. Use the search engine to look up information. \
You are allowed to make multiple calls (either together or in sequence). \
Only look up information when you are sure of what you want. \
If you need to look up some information before asking a follow up question, you are allowed to do that!
"""
model = ChatOpenAI(model="gpt-4o")
abot = Agent(model, [tool], system=prompt, checkpointer=memory)

3. Streaming Intermediate Steps

With our checkpointer in place, let’s see how to manage conversational history. To do this, we need to introduce the concept of a thread. A thread is a unique identifier for a conversation, allowing the checkpointer to manage multiple conversations independently.
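In LangGraph, the thread is passed through the run configuration: the conventional shape is a configurable dict carrying a thread_id, and invoking the graph again with the same thread_id resumes from the saved checkpoint. A sketch of that shape (the invoke call is commented out because it needs the compiled graph built above):

```python
# Each conversation gets its own thread_id; the checkpointer keys
# saved state by this identifier.
thread = {"configurable": {"thread_id": "1"}}

# Hypothetical usage with the agent built above:
# messages = [HumanMessage(content="What is the weather in SF?")]
# result = abot.graph.invoke({"messages": messages}, config=thread)

print(thread["configurable"]["thread_id"])  # 1
```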
