When you start building complex AI agents designed for more than just a single turn of conversation, you’ll quickly run into two essential concepts: persistence and streaming. These features are the bedrock of creating robust, production-ready applications that can handle long-running tasks, maintain context, and provide a transparent user experience.
Persistence is the ability to save the state of your agent at any point in time. This allows you to pause and resume conversations, giving your agent a form of memory. It’s crucial for multi-turn dialogues and for enabling more advanced features like human-in-the-loop interventions.
Streaming provides a real-time view into your agent’s operations. Instead of waiting for a final answer, you can see the intermediate steps — like which tools are being called — or even the final answer as it’s being generated, token by token.
In this hands-on tutorial, the fifth part of the Hands-on LangGraph series, we’ll take a basic LangGraph agent and enhance it with these powerful capabilities.
Table of Contents:
Setting Up the Agent
Adding Persistence with Checkpointers
Streaming Intermediate Steps
Streaming Tokens in Real Time
This article is the fifth in the ongoing series Building LLM Agents with LangGraph:
Persistence and Streaming in LangGraph (You are here!)
Human in the Loop in LLM Agents (Coming Soon!)
Putting it All Together! Building Essay Writer Agent (Coming Soon!)
This series is designed to take readers from foundational knowledge to advanced practices in building LLM agents with LangGraph.
Each article delves into essential components, such as constructing simple ReAct agents from scratch, leveraging LangGraph’s building units, utilizing agentic search tools, implementing persistence and streaming capabilities, integrating human-in-the-loop interactions, and culminating in the creation of a fully functional essay-writing agent.
By the end of this series, you will have a comprehensive understanding of LangGraph, practical skills to design and deploy LLM agents, and the confidence to build customized AI-driven workflows tailored to diverse applications.
1. Setting Up the Agent
First, let’s get our initial agent set up. This will be the same research assistant agent from previous examples, equipped with a search tool. We begin by loading our environment variables and making the necessary imports.
```python
from dotenv import load_dotenv
from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
import operator
from langchain_core.messages import AnyMessage, SystemMessage, HumanMessage, ToolMessage
from langchain_openai import ChatOpenAI
from langchain_community.tools.tavily_search import TavilySearchResults

_ = load_dotenv()
```
Next, we’ll initialize our search tool and define the structure of our agent’s state. The state will simply contain a list of messages that we’ll append to over time.
```python
tool = TavilySearchResults(max_results=2)

class AgentState(TypedDict):
    messages: Annotated[list[AnyMessage], operator.add]
```
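The reducer declared in `Annotated[list[AnyMessage], operator.add]` is what tells LangGraph how to merge each node’s returned `messages` update into the existing state: by list concatenation rather than replacement. A minimal, framework-free illustration of that merge rule:

```python
import operator

# LangGraph applies the declared reducer (here, operator.add) to
# combine the old state value with each node's returned update.
# For lists, operator.add is plain concatenation, so new messages
# are appended to the history instead of overwriting it.
existing = ["first message"]
update = ["second message"]
merged = operator.add(existing, update)
print(merged)  # ['first message', 'second message']
```

Without a reducer, a node returning `{'messages': [...]}` would replace the whole list; with `operator.add`, the conversation history accumulates across steps.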
2. Adding Persistence with Checkpointers
To add persistence, LangGraph introduces the concept of a checkpointer. A checkpointer automatically saves a snapshot of the graph’s state after every step (i.e., after each node is executed). This allows the agent to pick up right where it left off in a future interaction.
For this guide, we’ll use SqliteSaver, a straightforward checkpointer that uses a SQLite database. By passing “:memory:”, we create a temporary in-memory database that is perfect for testing. For production, you could easily point this to a file-based database or use more robust checkpointers for Postgres or Redis.
```python
from langgraph.checkpoint.sqlite import SqliteSaver

memory = SqliteSaver.from_conn_string(":memory:")
```

Note that in recent versions of the `langgraph-checkpoint-sqlite` package, `from_conn_string` returns a context manager, so you may instead need `with SqliteSaver.from_conn_string(":memory:") as memory:` or construct the saver directly from a `sqlite3` connection.
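To see why a file path (unlike `":memory:"`) is what you want in production, here is a small stdlib-only sketch of the underlying idea. The table layout below is purely illustrative and is not LangGraph’s actual checkpoint schema:

```python
import os
import sqlite3
import tempfile

# An in-memory SQLite database vanishes when its connection closes;
# a file-backed one survives process restarts, which is what a
# production checkpointer relies on.
db_path = os.path.join(tempfile.mkdtemp(), "checkpoints.db")

conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE checkpoints (thread_id TEXT, state TEXT)")
conn.execute("INSERT INTO checkpoints VALUES ('thread-1', '{}')")
conn.commit()
conn.close()

# Reconnecting to the same file recovers the saved rows.
conn = sqlite3.connect(db_path)
rows = conn.execute("SELECT thread_id FROM checkpoints").fetchall()
conn.close()
print(rows)  # [('thread-1',)]
```

Swapping `":memory:"` for a file path in `from_conn_string` gives you the same durability with no other code changes.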
Now, we need to modify our Agent class to accept this checkpointer when the graph is compiled. We’ll add a checkpointer parameter to the __init__ method and pass it to graph.compile().
```python
class Agent:
    def __init__(self, model, tools, checkpointer, system=""):
        self.system = system
        graph = StateGraph(AgentState)
        graph.add_node("llm", self.call_openai)
        graph.add_node("action", self.take_action)
        graph.add_conditional_edges(
            "llm", self.exists_action, {True: "action", False: END}
        )
        graph.add_edge("action", "llm")
        graph.set_entry_point("llm")
        # Pass the checkpointer to the compile method
        self.graph = graph.compile(checkpointer=checkpointer)
        self.tools = {t.name: t for t in tools}
        self.model = model.bind_tools(tools)

    def call_openai(self, state: AgentState):
        messages = state['messages']
        if self.system:
            messages = [SystemMessage(content=self.system)] + messages
        message = self.model.invoke(messages)
        return {'messages': [message]}

    def exists_action(self, state: AgentState):
        result = state['messages'][-1]
        return len(result.tool_calls) > 0

    def take_action(self, state: AgentState):
        tool_calls = state['messages'][-1].tool_calls
        results = []
        for t in tool_calls:
            print(f"Calling: {t}")
            result = self.tools[t['name']].invoke(t['args'])
            results.append(
                ToolMessage(tool_call_id=t['id'], name=t['name'], content=str(result))
            )
        print("Back to the model!")
        return {'messages': results}
```
With our modified class, we can now instantiate our agent, passing in the memory object we created.
prompt = """You are a smart research assistant. Use the search engine to look up information. \
You are allowed to make multiple calls (either together or in sequence). \
Only look up information when you are sure of what you want. \
If you need to look up some information before asking a follow up question, you are allowed to do that!
"""
model = ChatOpenAI(model="gpt-4o")
abot = Agent(model, [tool], system=prompt, checkpointer=memory)
3. Streaming Intermediate Steps
With our checkpointer in place, let’s see how to manage conversational history. To do this, we need to introduce the concept of a thread. A thread is a unique identifier for a conversation, allowing the checkpointer to manage multiple conversations independently.
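In practice, the thread is supplied through the `"configurable"` section of the run config passed to the graph. The streaming loop below is a sketch only, and is commented out because it assumes the `abot` agent defined above, a hypothetical weather question, and valid OpenAI and Tavily API keys:

```python
# Each conversation is addressed by a thread_id; the identifier
# itself is arbitrary (any stable string works), and the
# checkpointer uses it to load and extend the right history.
config = {"configurable": {"thread_id": "1"}}

# Sketch of streaming intermediate steps (requires the agent above
# plus API keys, so it is shown as comments here):
#
# messages = [HumanMessage(content="What is the weather in SF?")]
# for event in abot.graph.stream({"messages": messages}, config):
#     for value in event.values():
#         print(value["messages"])
```

Each `event` yielded by `stream` corresponds to one node finishing, so you see the tool calls and intermediate messages as they happen rather than only the final answer.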