AI & Agents

How to Use Cloudflare Durable Objects for AI Agents

Cloudflare Durable Objects provide globally distributed, stateful compute instances that AI agents can use for session management, conversation state, and coordination.

Fast.io Editorial Team 12 min read
Cloudflare Durable Objects architecture for AI agents

What Are Durable Objects for AI Agents?

Durable Objects are Cloudflare's solution for stateful compute at the edge. Unlike traditional serverless functions that are stateless, each Durable Object combines compute and storage in a single instance with a globally unique identifier. For AI agents, this means you can maintain conversation history, coordinate between multiple clients, and persist state without managing separate databases.

A Durable Object is a Worker with built-in SQLite storage and guaranteed single-threaded execution. When you send a request to a specific Durable Object by name, Cloudflare routes that request to the exact instance anywhere in the world. This makes them valuable for AI agents that need to maintain context across multiple interactions.

Cloudflare Workers AI supports over 20 open-source models that can run alongside Durable Objects, letting you build complete AI agent systems on Cloudflare's edge network. The combination of stateful storage, global distribution, and integrated inference creates a solid foundation for agent development.

Why Use Durable Objects Instead of Traditional Databases?

AI agents face unique challenges compared to typical web applications. They need to manage long-running conversations, coordinate between multiple tool calls, and maintain session state that persists beyond a single request. Traditional architectures require separate infrastructure for compute (API servers), storage (databases), and state management (Redis or similar). Durable Objects collapse this complexity into a single primitive. Each object instance gets zero-latency SQLite storage colocated with the compute, eliminating network round-trips between your application and database. This matters for AI agents where every millisecond of latency adds up across dozens of LLM calls and tool executions.

Single-threaded consistency is another key advantage. Each Durable Object processes requests serially, which means you never have to worry about race conditions when updating agent state. If two users are interacting with the same AI agent session at the same time, their requests are queued and processed in order. For simple use cases like maintaining a user's chat history, Durable Objects might be overkill. But when you need coordination between multiple clients, real-time presence, or complex state machines, they solve problems that would otherwise require custom infrastructure.
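The serialization guarantee can be sketched with a small promise chain. This class is illustrative only: the real serialization is done by the Durable Objects runtime, not by your code, but it shows why a read-modify-write on shared state can never race when requests are processed in order.

```typescript
// Minimal sketch of per-instance serialization: each request queues behind
// the previous one before touching shared state, so two "concurrent" callers
// can never interleave their read-modify-write of `messages`.
class SerialSession {
  private tail: Promise<unknown> = Promise.resolve();
  private messages: string[] = [];

  handle(text: string): Promise<number> {
    const run = this.tail.then(() => {
      this.messages.push(text); // read-modify-write, safely serialized
      return this.messages.length;
    });
    this.tail = run.catch(() => undefined); // keep the chain alive on errors
    return run;
  }
}
```

Two callers firing at the same time still see deterministic ordering: the first `handle` call resolves to 1 and the second to 2, exactly as if they had arrived one after the other.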

Durable Objects state management architecture

Durable Objects vs External Storage: Decision Framework

The most common question developers ask is when to use Durable Objects versus external storage like S3, Fast.io, or traditional databases. The answer depends on your access patterns and data characteristics.

Use Durable Objects for:

  • Session state: Active conversation history, in-progress tool calls, temporary context
  • Coordination locks: Ensuring only one agent instance processes a task at a time
  • Real-time coordination: Managing presence, collaborative editing, or multi-agent handoffs
  • Small, hot data: Frequently accessed state under 1GB that needs millisecond latency

Use external storage for:

  • Long-term memory: Historical conversations, knowledge bases, training data
  • Large files: Documents, images, audio recordings, model weights
  • Shared state across agents: Knowledge that multiple agents need to access
  • RAG pipelines: Vector embeddings, document indexes, semantic search

A hybrid approach works best for most production systems. Store active session state in Durable Objects, write completed conversations to external storage, and use a service like Fast.io for file operations. Fast.io's 251 MCP tools work alongside agent frameworks and provide persistent storage that outlives individual sessions. Durable Objects have a 1GB storage limit per instance and charge for storage duration. If your agent accumulates conversation history over weeks, that cost adds up; external storage is much cheaper for cold data you rarely access.
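The hybrid split can be sketched as a small rotation helper. The `archive` callback here is a stand-in for whatever external upload you actually use (a Fast.io MCP call, an S3 put); its name and signature are assumptions for illustration.

```typescript
type Message = { role: 'user' | 'assistant' | 'tool'; content: string };

// Keep only the hot tail of a conversation in the Durable Object and hand
// everything older to external storage. `archive` is a placeholder for your
// real upload path -- its shape is an assumption, not a specific API.
async function rotateHistory(
  history: Message[],
  keep: number,
  archive: (cold: Message[]) => Promise<void>
): Promise<Message[]> {
  if (history.length <= keep) return history;

  const cold = history.slice(0, history.length - keep);
  await archive(cold);          // ship cold messages out of the object
  return history.slice(-keep);  // hot tail stays in Durable Object storage
}
```

Call it before each `storage.put` of the history; the object's storage stays bounded while nothing is ever lost.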

Fast.io features

Give Your AI Agents Persistent Storage

Fast.io provides 50GB free storage specifically for AI agents, with 251 MCP tools, built-in RAG, and ownership transfer. Use Durable Objects for session state and Fast.io for files, long-term memory, and team collaboration.

Building Your First AI Agent with Durable Objects

Let's walk through implementing a stateful AI agent using Durable Objects. This example uses Cloudflare's Agents SDK, but the patterns apply to any framework.

Step 1: Define Your Durable Object Class

Create a Durable Object that manages agent state. Each instance represents one agent session:

export class AgentSession {
  constructor(private state: DurableObjectState, private env: Env) {}

  async fetch(request: Request) {
    const url = new URL(request.url);

    if (url.pathname === '/chat') {
      const message = await request.json();
      return this.handleChat(message);
    }

    return new Response('Not found', { status: 404 });
  }

  async handleChat(message: any) {
    // Get conversation history from SQLite storage
    const history = (await this.state.storage.get('messages')) || [];

    // Add new message
    history.push({ role: 'user', content: message.text });

    // Call LLM (simplified - use your preferred inference API)
    const response = await this.callLLM(history);

    // Store updated history
    history.push({ role: 'assistant', content: response });
    await this.state.storage.put('messages', history);

    return new Response(JSON.stringify({ reply: response }));
  }
}

Step 2: Route Requests to Named Objects

Each user or session gets a unique Durable Object instance:

export default {
  async fetch(request: Request, env: Env) {
    const url = new URL(request.url);
    const sessionId = url.searchParams.get('session');

    if (!sessionId) {
      return new Response('Missing session parameter', { status: 400 });
    }

    // Get or create a Durable Object for this session
    const id = env.AGENT_SESSION.idFromName(sessionId);
    const stub = env.AGENT_SESSION.get(id);

    return stub.fetch(request);
  }
};

Step 3: Handle Session Lifecycle

Durable Objects automatically hibernate when idle and wake on new requests. This means you don't pay for long-running processes between user messages. For AI agents waiting on slow LLM responses, this saves significant cost compared to keeping a server running continuously.

async handleChat(message: any) {
  const history = (await this.state.storage.get('messages')) || [];

  // Check if the session is older than 1 hour
  const lastActivity = await this.state.storage.get('lastActivity');
  if (lastActivity && Date.now() - lastActivity > 3600000) {
    // Archive to external storage before continuing
    await this.archiveToExternalStorage(history);
    await this.state.storage.delete('messages');
  }

  // Update the activity timestamp
  await this.state.storage.put('lastActivity', Date.now());

  // Continue with chat logic...
}

Storage Limits and Cost Optimization

Durable Objects on the Workers Free plan include 1GB of storage across all instances. This sounds small, but for agent session state it's usually sufficient. A typical conversation with 100 messages at ~500 tokens each is only a few hundred kilobytes.

Storage is the part that accumulates. Durable Objects charge $0.20 per GB-month for stored data. Keeping 100MB of session data around 24/7 is only about $0.02 per month at published pricing, but that cost grows linearly with every session you never clean up. Compare this to Fast.io's agent tier at 50GB free storage with no time limits, and the cost model becomes clear: use Durable Objects for transient state, external storage for everything else.

Read and write costs are small for typical agent workloads. You pay $1 per million reads and $1 per million writes. Even a chatbot handling 10,000 conversations per day with 50 messages each (500K writes per day, roughly 15M per month) only costs about $15/month in write operations.

Practical optimization strategies:

  • Implement session TTL: Delete objects after 24 hours of inactivity
  • Archive proactively: Move conversations to external storage after N messages
  • Use compression: JSON conversations compress well with gzip (50-70% reduction)
  • Batch writes: Don't write to SQLite on every message; buffer in memory and flush periodically
  • Separate hot and cold paths: Keep only the last 10 messages in Durable Objects, load older context on demand
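The "batch writes" strategy above can be sketched like this. A plain `Map` stands in for the Durable Object storage API so the flush accounting is easy to see; in a real object you would swap in `this.state.storage.put` and also flush from an alarm or before hibernation.

```typescript
// Buffer messages in memory and flush to storage once every `flushEvery`
// messages instead of on each one. The Map is a stand-in for Durable Object
// storage, used here so the example is self-contained.
class BufferedHistory {
  private buffer: string[] = [];
  private flushes = 0;

  constructor(
    private storage: Map<string, string[]>,
    private flushEvery = 5
  ) {}

  async add(message: string): Promise<void> {
    this.buffer.push(message);
    if (this.buffer.length >= this.flushEvery) await this.flush();
  }

  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    const stored = this.storage.get('messages') ?? [];
    this.storage.set('messages', stored.concat(this.buffer)); // one write per flush
    this.buffer = [];
    this.flushes++;
  }

  get flushCount(): number {
    return this.flushes;
  }
}
```

With `flushEvery = 5`, ten messages produce two storage writes instead of ten; the trade-off is that unflushed messages in memory are lost if the object is evicted before a flush, which is why flushing on a timer or lifecycle hook matters in production.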

For agents that need to manage files alongside conversation state, Fast.io's MCP integration provides a complementary storage layer. Store chat history in Durable Objects, reference files by URL in Fast.io, and only load file contents when the agent needs them.

Multi-Agent Coordination Patterns

Durable Objects excel at coordinating multiple agents or users working on shared tasks. The single-threaded execution model makes this simpler than distributed systems with eventual consistency.

Pattern 1: Leader Election

Use a named Durable Object as a coordinator. The first agent to contact it becomes the leader:

class TaskCoordinator {
  async assignWork(agentId: string) {
    const leader = await this.state.storage.get('leader');

    if (!leader) {
      await this.state.storage.put('leader', agentId);
      return { role: 'leader', task: 'orchestrate' };
    }

    return { role: 'worker', task: 'execute' };
  }
}

Pattern 2: Distributed Lock

Prevent multiple agents from processing the same file simultaneously:

async acquireLock(resourceId: string, agentId: string) {
  const locks = (await this.state.storage.get('locks')) || {};

  if (locks[resourceId]) {
    return { acquired: false, holder: locks[resourceId] };
  }

  locks[resourceId] = agentId;
  await this.state.storage.put('locks', locks);
  return { acquired: true };
}

Fast.io provides native file locks for similar use cases, but Durable Objects let you implement custom locking semantics for any resource.

Pattern 3: Agent Presence

Track which agents are currently active in a workspace:

async reportPresence(agentId: string) {
  const presence = (await this.state.storage.get('activeAgents')) || {};
  presence[agentId] = Date.now();

  // Remove stale agents (no heartbeat in 60s)
  const cutoff = Date.now() - 60000;
  for (const [id, timestamp] of Object.entries(presence)) {
    if (timestamp < cutoff) delete presence[id];
  }

  await this.state.storage.put('activeAgents', presence);
  return Object.keys(presence);
}

These patterns work because Durable Objects guarantee that only one instance processes requests for a given ID at a time. You don't need distributed locks, consensus algorithms, or complex retry logic.

Multi-agent coordination architecture

Integrating with LLM Inference

Cloudflare Workers AI provides access to over 20 open-source models that run on Cloudflare's network. You can call these directly from Durable Objects without external API requests.

Basic inference call:

async callLLM(messages: Message[]) {
  const response = await this.env.AI.run('@cf/meta/llama-3-8b-instruct', {
    messages: messages,
    max_tokens: 500
  });

  return response.response;
}

Streaming responses are important for AI agents. Users expect to see output as it's generated, not after the entire response completes:

async streamChat(messages: Message[]) {
  const stream = await this.env.AI.run('@cf/meta/llama-3-8b-instruct', {
    messages: messages,
    stream: true
  });

  return new Response(stream, {
    headers: { 'content-type': 'text/event-stream' }
  });
}

Tool calling requires managing multi-turn conversations. The agent calls your function, you execute it, and return results as a new message:

async handleToolCall(messages: Message[]) {
  const response = await this.env.AI.run('@cf/meta/llama-3-8b-instruct', {
    messages: messages,
    tools: [
      {
        name: 'search_files',
        description: 'Search for files in user workspace',
        parameters: {
          type: 'object',
          properties: {
            query: { type: 'string' }
          },
          required: ['query']
        }
      }
    ]
  });

  if (response.tool_calls) {
    // Execute tools and add results to the conversation
    const result = await this.executeTools(response.tool_calls);
    messages.push({ role: 'tool', content: result });

    // Call the LLM again with the tool results
    return this.handleToolCall(messages);
  }

  return response.response;
}

For file operations, the Fast.io MCP server exposes 251 tools that work with any LLM. Instead of building custom file management tools, integrate the MCP server and your agent gets upload, download, search, and sharing capabilities automatically.

When NOT to Use Durable Objects

Durable Objects are powerful but not a universal solution. Here are scenarios where they're the wrong choice:

1. Batch Processing Large Datasets

If your agent needs to process thousands of documents, don't store them all in a Durable Object. You'll hit the 1GB storage limit quickly, and you'll pay storage costs for data that doesn't need low-latency access. Use external storage and stream data through your agent.

2. Long-Term Knowledge Bases

Conversation history from six months ago doesn't need to live in a Durable Object. Archive completed sessions to cheaper storage. Load old conversations on demand if a user requests historical context.

3. Cross-Region Replication

Durable Objects are strongly consistent but not replicated. If your object instance is in the US and a request comes from Asia, there's network latency. For globally distributed agents that need local reads, use a database with regional replicas.

4. Analytics and Reporting

Don't use Durable Objects as a data warehouse. If you need to run queries like "how many conversations mentioned pricing last week," export data to a proper analytics database. Durable Objects SQLite is optimized for transactional workloads, not analytical queries.

5. File Storage and Media

Durable Objects can technically store binary data, but they're inefficient for it. A single 50MB video would consume 50MB of your 1GB limit. Use Fast.io or S3 for files, store only metadata and URLs in Durable Objects. The right architecture combines multiple storage layers: Durable Objects for active session state, Fast.io for files and RAG data, and a database for long-term analytics.

Frequently Asked Questions

Can I use Durable Objects for AI agent memory?

Yes, Durable Objects work well for short-term memory like active conversation history. However, for long-term memory spanning weeks or months, use external storage due to the 1GB limit per instance and storage costs of $0.20 per GB-month. A hybrid approach stores recent context in Durable Objects and archives older conversations to services like Fast.io.

How do Durable Objects work with AI agents?

Durable Objects provide stateful compute instances that AI agents use to maintain session state, conversation history, and coordination locks. Each object has a unique identifier, single-threaded execution, and built-in SQLite storage. Agents send requests to named objects to persist state across multiple LLM calls and tool executions without managing separate databases.

What is the storage limit for Durable Objects?

Each Durable Object instance has a 1GB storage limit. The Workers Free plan includes 1GB total storage across all instances. For production use, storage costs $0.20 per GB-month. For AI agents, a single instance can hold hundreds of thousands of typical chat messages before hitting the limit, far more than one active session needs.

Should I use Durable Objects or a database for AI agent state?

Use Durable Objects for hot session state that needs millisecond latency and strong consistency. Use external databases for long-term storage, analytics, and shared state across many agents. Most production systems use both: Durable Objects for active sessions and a database for historical data.

How do I handle session expiration in Durable Objects?

Implement a TTL pattern by storing lastActivity timestamps and checking them on each request. After a threshold like 24 hours, archive the session data to external storage and delete it from the Durable Object. This prevents storage costs from accumulating for abandoned sessions.
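The TTL check described above boils down to a timestamp comparison. The helper below is a hypothetical sketch of that predicate; the names are illustrative, not a Cloudflare API.

```typescript
// Hedged sketch of a session TTL check: given the stored lastActivity
// timestamp, decide whether the session has been idle past the threshold
// and should be archived and deleted from the Durable Object.
const DAY_MS = 24 * 60 * 60 * 1000;

function sessionExpired(
  lastActivity: number | undefined,
  now: number,
  ttlMs: number = DAY_MS
): boolean {
  // A session with no recorded activity yet is treated as fresh.
  return lastActivity !== undefined && now - lastActivity > ttlMs;
}
```

Run the check at the top of each request handler: if it returns true, archive the history to external storage, delete the stored keys, and start the session fresh.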

Can multiple AI agents share the same Durable Object?

Yes, multiple agents can coordinate through a shared Durable Object acting as a coordinator. Use named objects to create shared resources like task queues, leader election, or presence tracking. The single-threaded execution ensures updates happen in order without race conditions.

How do Durable Objects compare to Redis for agent state?

Durable Objects provide stronger consistency guarantees (single-threaded) and colocated compute-storage, while Redis offers faster raw read performance and richer data structures. For AI agents, Durable Objects simplify architecture by eliminating the need for separate Redis infrastructure, but Redis may perform better for high-throughput, read-heavy workloads.

What happens to Durable Object state when my Worker deploys?

Durable Object state persists across Worker deployments. Storage lives independently of your code. When you deploy new code, existing objects continue running with their current state. New requests use the updated code. This allows zero-downtime upgrades for stateful AI agents.

