Should I use OpenAI Agents SDK or LangGraph?

It depends on workflow complexity and model preferences. OpenAI Agents SDK is the faster path for linear handoff chains (triage, routing, sequential pipelines) and teams committed to OpenAI models. LangGraph is the better choice for branching workflows, human-in-the-loop approvals, durable execution, and multi-provider setups. If your workflow has fewer than 8 agents in a mostly sequential pattern, start with the OpenAI SDK. If you need conditional routing, cycles, or checkpointed state, start with LangGraph.

Does OpenAI Agents SDK support MCP?

Yes. The SDK has production-grade MCP support with five transport options: Hosted MCP (tool execution delegated to OpenAI's API infrastructure), Streamable HTTP, SSE (deprecated), stdio for local subprocess servers, and an MCP Server Manager for orchestrating multiple servers. Features include tool filtering, approval policies, per-call metadata injection, and automatic tracing of MCP interactions.

Can LangGraph work with OpenAI models?

Yes. LangGraph is model-agnostic and supports OpenAI models through LangChain's provider integrations. You can use GPT-4o, o1, or any other OpenAI model as the LLM backing individual nodes in your graph. You can also mix providers within the same graph, using different models for different nodes based on their strengths.

Which is better for production multi-agent systems?

LangGraph has more production mileage, with roughly 400 companies (including Klarna, Uber, and LinkedIn) running it in production. Its checkpointing, durable execution, and LangSmith observability provide a clear production path. The OpenAI Agents SDK is production-viable and simpler to deploy for straightforward handoff patterns, but its ecosystem for production tooling (monitoring, persistence, debugging) is less mature.

Can I use both frameworks together?

Yes. A practical hybrid approach uses the OpenAI SDK for individual agent definitions (leveraging its MCP support and guardrails) while using LangGraph to orchestrate the overall workflow graph. Each OpenAI SDK agent becomes a node in the LangGraph state graph, giving you tight model integration at the agent level and flexible state-based routing at the orchestration level.

What is LangGraph checkpointing and why does it matter?

LangGraph saves a snapshot of the full graph state after every node executes. These checkpoints persist to a database (PostgreSQL, MongoDB, or Redis in production). This enables four capabilities: resuming interrupted workflows from the last successful step, time-travel debugging by inspecting any historical state, human-in-the-loop workflows where execution pauses and resumes on human input, and fault-tolerant execution where crashed agents replay from their last checkpoint instead of restarting.

OpenAI Agents SDK vs LangGraph Compared (2026)

Two Architectures for Multi-Agent Orchestration

Most comparisons of agent frameworks bury the architectural difference under feature checklists. The real question is simpler: how does work flow between agents?

OpenAI Agents SDK uses handoffs. An agent finishes its piece of a task and explicitly transfers control to another agent, passing along the conversation history. The receiving agent picks up where the previous one stopped. Think of it like a relay race where each runner hands the baton to the next.

LangGraph uses state graphs. You define a directed graph where nodes are functions or agents, edges define transitions (including conditional branches), and a typed state object flows through the entire graph. Think of it like a flowchart that executes itself, with the state accumulating data at each step.

This distinction shapes everything else: how you debug, how you scale, what happens when something fails, and how tightly you're coupled to a specific model provider.

Here's a minimal example of each approach. In the OpenAI SDK, you define agents and wire handoffs between them:

from agents import Agent, handoff

billing_agent = Agent(
    name="Billing",
    instructions="Handle billing questions.",
    model="gpt-4o",
)

triage_agent = Agent(
    name="Triage",
    instructions="Route to the right specialist.",
    handoffs=[handoff(agent=billing_agent)],
)

In LangGraph, you build a graph with explicit state and transitions:

from langgraph.graph import StateGraph
from typing import TypedDict

class TicketState(TypedDict):
    query: str
    category: str
    response: str

graph = StateGraph(TicketState)
graph.add_node("classify", classify_ticket)
graph.add_node("billing", handle_billing)
graph.add_conditional_edges(
    "classify",
    route_by_category,
    {"billing": "billing", "support": "support"},
)

Both get the job done for simple cases. The differences emerge as complexity grows.

Handoff Model: How OpenAI Agents SDK Transfers Control

The OpenAI Agents SDK, released in March 2025 as a replacement for the experimental Swarm framework, centers on four primitives: Agents, Tools, Handoffs, and Guardrails. The handoff is the orchestration mechanism.

When an agent decides it needs another agent's help, it emits a handoff tool call. The SDK captures the full conversation history, transfers control to the target agent, and the target agent continues with that context. The framework represents handoffs as tools to the LLM, generating names like transfer_to_billing_agent automatically from agent names.

What you can customize on a handoff:

tool_name_override and tool_description_override for controlling how the LLM sees the handoff option
on_handoff callback that runs during the transfer, receiving the agent context
input_type for model-generated metadata (reason for handoff, priority, summary)
input_filter for transforming conversation history before the receiving agent sees it
is_enabled for dynamic control over whether a handoff is available

The SDK also supports nested handoff history (currently beta), which collapses prior conversation transcripts into summarized messages. This helps keep context manageable when agents chain through multiple handoffs.

Where handoffs work well: Linear pipelines where Agent A triages, Agent B handles billing, Agent C handles support. Customer service tiers, content review chains, sequential approval workflows.

Where handoffs struggle: The pattern becomes unwieldy beyond 8 to 10 agent types. Every agent needs to know which other agents it can hand off to, creating a web of explicit connections. There's no built-in mechanism for conditional routing based on accumulated state, parallel execution, or cycles where an agent might need to revisit a previous step.

State Graph Model: How LangGraph Routes Through Nodes

LangGraph, built by LangChain, models agent workflows as directed graphs with typed state. You define a StateGraph with a schema (usually a TypedDict or Pydantic model), add nodes that read and write to that state, and connect them with edges. Conditional edges let you branch based on the current state values.

The graph executes step by step. At each node, the function receives the current state, does its work (calling an LLM, querying a database, running a tool), and returns updated state. The framework handles routing to the next node based on the edges you defined.

What makes LangGraph's state model distinctive:

Reducer-based updates. When multiple nodes update the same state key concurrently, reducers define how those updates merge. This is critical for parallel multi-agent systems where two agents might write to the same field.
Checkpointing at every step. LangGraph saves a snapshot of the full state after each node executes. You can inspect any historical state, replay from a checkpoint, or fork execution from a past point.
First-class interrupt() primitive. Any node can pause execution and wait for external input. The state persists to storage, and execution resumes when new input arrives. This is the foundation for human-in-the-loop workflows.

For production, LangGraph supports PostgreSQL, MongoDB, and Redis as checkpoint backends. The AsyncPostgresSaver handles connection pooling and async execution for high-throughput deployments. Companies like Klarna, Uber, and LinkedIn run LangGraph agents at production scale.

Where state graphs work well: Branching workflows, human-in-the-loop approvals, long-running processes that need durability, systems requiring audit trails of every state transition, and complex multi-agent topologies where agents need to revisit earlier steps.

Where state graphs add friction: The learning curve is steep. Defining schemas, edges, conditional routing, and reducers requires more upfront design than wiring handoffs. For a simple three-agent pipeline, LangGraph's graph definition is significantly more code than the equivalent OpenAI SDK setup. One common production pitfall: storing large binary artifacts in state causes checkpoint bloat, since every step creates a new snapshot.

Give your agents a shared workspace that humans can actually use

Fastio workspaces connect to both OpenAI Agents SDK and LangGraph via MCP. generous storage, included credits per month, no credit card required.

Start 14-Day Trial

Side-by-Side Comparison

Here is how the two frameworks compare across the dimensions that matter most in production:

Orchestration model

OpenAI Agents SDK uses sequential handoffs. An agent transfers control to exactly one other agent, carrying conversation context. LangGraph uses directed state graphs where nodes can branch, loop, and execute in parallel based on conditional edges.

Model lock-in

The OpenAI SDK is designed for OpenAI models. While a LiteLLM extension (beta) lets you use other providers like Anthropic, Gemini, or Bedrock, the adapters add a compatibility layer and some features may not work identically across providers. LangGraph is model-agnostic through LangChain's provider ecosystem, supporting any LLM with a LangChain integration out of the box.

Persistence and state

OpenAI Agents SDK uses conversation history as its primary state mechanism, with context variables for application-level data. There is no built-in durable persistence or checkpointing. LangGraph checkpoints state at every node, supporting PostgreSQL, MongoDB, and Redis backends. You can pause, inspect, resume, and time-travel through execution history.

Human-in-the-loop

OpenAI Agents SDK supports input and output guardrails that can intercept and validate agent actions, but lacks a first-class interrupt-and-resume mechanism. LangGraph's interrupt() primitive pauses execution at any node, persists state, and resumes when human input arrives, with approve, edit, reject, and respond actions built in.

MCP support

The OpenAI SDK has production-grade MCP integration with five transport options: Hosted MCP (delegated to OpenAI's API), Streamable HTTP, SSE (deprecated), stdio, and a multi-server manager. Tool filtering, approval policies, and per-call metadata injection are all supported. LangGraph does not have native MCP support but integrates with MCP servers through LangChain's tool abstractions.

Observability

The OpenAI SDK includes built-in tracing with span and trace recording. LangGraph integrates with LangSmith for monitoring and offers LangGraph Studio, a visual debugger that lets you step through graph execution, inspect state at each node, and replay runs.

Maturity and adoption

LangGraph has over 90 million combined monthly downloads (with LangChain) and is used in production by roughly 400 companies. The OpenAI Agents SDK is newer with a smaller production deployment community, though it benefits from OpenAI's ecosystem reach and tight model integration.

Best for

OpenAI Agents SDK fits sequential handoff patterns: support tiers, content pipelines, triage routing. LangGraph fits complex topologies: branching workflows, cyclical processes, regulated industries requiring audit trails and human approvals.

Comparison of agent framework capabilities for production multi-agent systems

Production Trade-offs and Decision Framework

Choosing between these frameworks is less about features and more about the shape of your workflow and the constraints you operate under.

Pick OpenAI Agents SDK when:

Your workflow is mostly linear. Agent A triages, Agent B handles the request, Agent C reviews. Handoffs map directly to this pattern with minimal code.
You're already committed to OpenAI models. The SDK's tight integration with GPT-4o and newer models means you get the best performance and feature coverage without adapter layers.
You need MCP integration. The SDK's five transport options and hosted MCP support make it the more turnkey choice for connecting agents to external tools via MCP.
Speed to production matters more than architectural flexibility. A working multi-agent system in the OpenAI SDK can ship in an afternoon. The same system in LangGraph takes longer to design but gives you more room to grow.
Your agent count stays under 8 to 10. Beyond that threshold, the handoff wiring becomes difficult to maintain and reason about.

Pick LangGraph when:

Your workflow branches, loops, or requires parallel execution. State graphs handle these patterns natively. Handoffs do not.
You need durable execution. Checkpointing means your agents survive restarts, and you can replay from any point. For regulated industries (finance, healthcare, legal), this is often a hard requirement.
Model flexibility matters. If you want to use Claude for reasoning, GPT-4o for function calling, and a local model for sensitive data, LangGraph's provider-agnostic design avoids adapter friction.
Human-in-the-loop is core to your workflow. LangGraph's interrupt primitive with persistent state was purpose-built for approval workflows, content review gates, and escalation paths.
You need observability into state transitions. LangGraph Studio lets you visualize the graph, step through execution, and inspect state at every node. This matters for debugging complex multi-agent interactions.

The hybrid option. These frameworks are not mutually exclusive. Some teams use the OpenAI SDK for individual agent logic (taking advantage of tight model integration and MCP support) while using LangGraph to orchestrate the overall workflow graph. An OpenAI SDK agent can be a node inside a LangGraph state graph.

Where Fastio Fits in Multi-Agent Architectures

Regardless of which framework you choose, multi-agent systems share a common infrastructure need: agents produce files, accumulate state, and eventually need to hand their output to humans. The orchestration framework handles agent-to-agent communication, but you still need somewhere for artifacts to live.

Local filesystems work for prototypes. S3 or Google Cloud Storage work for batch processing. But when agents collaborate on shared deliverables, or when a human needs to review and approve agent output before it ships, you need something that handles permissions, versioning, and discovery.

Fastio provides shared workspaces where agents and humans operate on the same files. Agents access workspaces through the Fastio MCP server, which exposes 19 consolidated tools for file operations, workspace management, AI queries, and workflow actions over Streamable HTTP (/mcp) or legacy SSE (/sse).

How this maps to agent frameworks:

OpenAI Agents SDK agents can use the Fastio MCP server directly, since the SDK supports hosted and Streamable HTTP MCP transports natively. An agent writes its output to a Fastio workspace, and the next agent in the handoff chain reads from the same workspace.
LangGraph agents connect to Fastio through LangChain's tool abstractions or by calling the MCP endpoints directly within graph nodes. Checkpoint state stays in LangGraph's persistence layer; file artifacts go to Fastio where they're versioned and searchable.

Intelligence Mode auto-indexes uploaded files for semantic search and AI-powered Q&A with citations, so agents (and humans) can query workspace contents without building a separate vector database. Metadata Views let you extract structured data from documents, turning PDFs and images into queryable rows that agents can process programmatically.

When the agent pipeline finishes, ownership transfer lets you hand the workspace to a client or stakeholder. The agent retains admin access for updates, but the human controls the deliverable.

The Business Trial includes 50 GB of storage, included credits per month, and 5 workspaces with no credit card and no expiration.

OpenAI Agents SDK vs LangGraph: Handoffs vs State Graphs for Multi-Agent Systems

Two Architectures for Multi-Agent Orchestration

Handoff Model: How OpenAI Agents SDK Transfers Control

State Graph Model: How LangGraph Routes Through Nodes

Give your agents a shared workspace that humans can actually use

Side-by-Side Comparison

Production Trade-offs and Decision Framework

Where Fastio Fits in Multi-Agent Architectures

Frequently Asked Questions

Related Resources

Give your agents a shared workspace that humans can actually use