How to Run OpenClaw Parallel Agents Without Wasting Tokens
Multi-agent coordination delivers up to an 81% improvement on parallelizable tasks, but it can degrade performance by as much as 70% when the work is actually sequential. OpenClaw's subagent system gives you the concurrency controls to get the speedup without the waste: maxConcurrent caps global lanes, maxChildrenPerAgent limits per-session fan-out, and the isolated and fork context modes determine how much state each worker inherits.
Parallel Speedups Only Work When the Work Is Parallelizable
Google Research's Finance-Agent benchmark found that multi-agent coordination improves performance by 81% on tasks that can be decomposed into independent subtasks. The same study measured 39 to 70% degradation on strict sequential reasoning, where agents trip over each other's intermediate state. That gap is the entire case for tuning your concurrency settings instead of accepting the defaults.
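One way to check whether a job is parallelizable before spawning anything is to write down which units depend on which and group them into waves. The sketch below is illustrative only: the unit names and the `parallel_waves` helper are hypothetical, and it uses Python's standard `graphlib` module rather than anything from OpenClaw.

```python
from graphlib import TopologicalSorter


def parallel_waves(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group units into waves. Units in one wave do not depend on each other."""
    sorter = TopologicalSorter(deps)  # maps each unit to the units it waits on
    sorter.prepare()                  # raises CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical jobs: three independent research units vs. a strict chain.
independent = {"acme": set(), "beta": set(), "gamma": set()}
chained = {"draft": set(), "review": {"draft"}, "publish": {"review"}}

print(parallel_waves(independent))  # a single wave containing all three units
print(parallel_waves(chained))      # three waves with one unit each
```

A job that collapses into one wide wave is a good candidate for parallel workers. A job that produces as many waves as units is sequential, and fanning it out only adds coordination overhead.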
OpenClaw's subagent system manages this through a non-blocking spawn-and-yield pattern. You dispatch a background worker that runs in its own session. When it finishes, a completion event flows back to the parent, which synthesizes results without polling or blocking. Three knobs control the behavior: a global concurrency ceiling, a per-session fan-out limit, and a maximum nesting depth.
The defaults work for small jobs: 8 global lanes, 5 children per session, 1 level of nesting. But once you run multiple projects or spawn workers that themselves need workers, those defaults start fighting you. The rest of this guide covers when and how to change them.
How to Tune maxConcurrent and maxChildrenPerAgent
Two settings control OpenClaw's parallel execution, and they work at different scopes. Understanding how they interact prevents the common scenario where workers queue even though you think there is headroom.
maxConcurrent sets a global ceiling on parallel subagent executions across your entire system. The default is 8. If three orchestrators each try to spawn 5 workers, only 8 of those 15 run simultaneously. The other 7 queue until a slot opens.
maxChildrenPerAgent caps fan-out per individual session. Each agent session can have at most this many active children. The default is 5, with a valid range up to 20. This stops a single orchestrator from monopolizing the entire global pool.
These two limits interact in ways that trip people up. Raising maxChildrenPerAgent to 10 while maxConcurrent stays at 8 means your orchestrator can request 10 children, but only 8 execute at once. Raising maxConcurrent to 16 while maxChildrenPerAgent stays at 5 only helps when multiple parent sessions spawn concurrently.
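The interaction between the two limits can be modeled in a few lines. This is a toy model, not OpenClaw's scheduler: the `running_now` helper is hypothetical, and it assumes global slots are handed out in the order the parents appear, which this guide does not specify.

```python
def running_now(requests: dict[str, int], max_concurrent: int = 8,
                max_children: int = 5) -> dict[str, int]:
    """Return how many children of each parent can execute at the same time.

    Assumption: slots are granted in the order parents appear in `requests`.
    """
    free = max_concurrent
    running = {}
    for parent, wanted in requests.items():
        # A parent is limited by its own fan-out cap and by the global pool.
        granted = min(wanted, max_children, free)
        running[parent] = granted
        free -= granted
    return running


# Three orchestrators, 5 requests each, defaults: 8 of 15 run.
print(running_now({"a": 5, "b": 5, "c": 5}))
# One orchestrator, fan-out raised to 10, global cap still 8: 8 run.
print(running_now({"a": 10}, max_concurrent=8, max_children=10))
# Global cap raised to 16, fan-out still 5: a single parent still gets 5.
print(running_now({"a": 10}, max_concurrent=16, max_children=5))
```

The third case is the one that surprises people: raising maxConcurrent alone does nothing for a single orchestrator, because maxChildrenPerAgent is already the tighter limit.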
Both settings are part of the same subagent configuration block, along with maxSpawnDepth (nesting depth, default 1, max 5), runTimeoutSeconds (default 900), and a per-worker model override. The subagents documentation covers the full configuration surface and valid ranges for each parameter.
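Before editing the configuration, it can help to sanity-check proposed values against the defaults and maximums quoted above. The sketch below is an assumption-laden illustration: `check_subagent_config` is a hypothetical helper, the lower bound of 1 is assumed, and the authoritative ranges are the ones in the subagents documentation.

```python
# Defaults and upper bounds as quoted in this guide.
DEFAULTS = {
    "maxConcurrent": 8,
    "maxChildrenPerAgent": 5,
    "maxSpawnDepth": 1,
    "runTimeoutSeconds": 900,
}
UPPER_BOUNDS = {"maxChildrenPerAgent": 20, "maxSpawnDepth": 5}


def check_subagent_config(overrides: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means nothing looked wrong."""
    merged = {**DEFAULTS, **overrides}
    problems = []
    for key in DEFAULTS:
        value = merged[key]
        # Assumption: every numeric setting must be a positive integer.
        if not isinstance(value, int) or value < 1:
            problems.append(f"{key} must be a positive integer, got {value!r}")
        elif key in UPPER_BOUNDS and value > UPPER_BOUNDS[key]:
            problems.append(f"{key}={value} exceeds the maximum of {UPPER_BOUNDS[key]}")
    if not problems and merged["maxChildrenPerAgent"] > merged["maxConcurrent"]:
        problems.append("maxChildrenPerAgent exceeds maxConcurrent, "
                        "so one parent can never run its full fan-out at once")
    return problems


print(check_subagent_config({}))                           # defaults pass
print(check_subagent_config({"maxChildrenPerAgent": 25}))  # over the maximum
print(check_subagent_config({"maxChildrenPerAgent": 10}))  # fan-out above global cap
```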
For most workflows, keep maxConcurrent between 8 and 16. Going higher increases token throughput but also multiplies cost. Anthropic's engineering data shows multi-agent systems consume roughly 15x more tokens than equivalent single-agent chats, so every additional lane carries a direct cost impact.
Choose Between Isolated and Fork Context
When you spawn a subagent, the context parameter determines what the worker knows before it starts.
Isolated (the default) gives the worker a clean transcript. It receives only the task prompt you pass in sessions_spawn, plus the agent's AGENTS.md and TOOLS.md files. No conversation history, no prior results. This is the right choice for independent research tasks, parallel file processing, and any work where the worker does not need to know what other workers have done.
Fork branches the parent's full transcript into the child. The worker sees everything the parent has seen up to that point, including prior tool calls and results. This costs more tokens because the forked context is larger, but it is necessary when the work depends on earlier decisions. If your orchestrator just analyzed a dataset and the next worker needs to reference that analysis, fork avoids re-doing the work.
The official docs put it directly: "Use fork sparingly. It is for context-sensitive delegation, not a replacement for writing a clear task prompt." In practice, isolated workers with detailed task prompts outperform forked workers with vague ones, because the parent agent has already digested the relevant context and can pass along a focused summary rather than dumping the entire transcript.
A practical split: use isolated for leaf tasks (research a company, summarize a document, lint a file) and fork for continuation tasks (refine the analysis we just ran, apply the style guide we agreed on).
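The token difference between the two modes is easy to estimate. The sketch below is a rough model with made-up numbers: `spawn_input_tokens` is a hypothetical helper, and it assumes a forked worker also loads the bootstrap files, which this guide does not state.

```python
def spawn_input_tokens(context: str, task_prompt: int,
                       parent_transcript: int, bootstrap_files: int = 0) -> int:
    """Rough input-token estimate for one worker at spawn time."""
    if context == "isolated":
        # Clean transcript: only the task prompt plus AGENTS.md and TOOLS.md.
        return task_prompt + bootstrap_files
    if context == "fork":
        # The parent's transcript is branched into the child first.
        # Assumption: bootstrap files are loaded in this mode as well.
        return parent_transcript + task_prompt + bootstrap_files
    raise ValueError(f"unknown context mode: {context!r}")


# Hypothetical sizes: 500-token prompt, 40,000-token parent transcript,
# 2,000 tokens of bootstrap files, 5 workers.
workers = 5
isolated_total = workers * spawn_input_tokens("isolated", 500, 40_000, 2_000)
fork_total = workers * spawn_input_tokens("fork", 500, 40_000, 2_000)
print(isolated_total)  # 12500
print(fork_total)      # 212500
```

With these illustrative sizes, forking five workers costs 17 times the input tokens of spawning them isolated, before any of them has done work.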
Persist parallel agent output in one workspace
50GB free cloud storage with file locks, automatic indexing, and MCP access for OpenClaw agents. No credit card, no trial expiration.
Spawn Workers and Collect Results
The spawn-yield cycle is the core mechanic. Here is the sequence:
1. Decompose the task. Break the work into independent units. Each unit should be completable without knowing the results of other units. If unit B depends on unit A's output, they are sequential and should not be parallelized.
2. Call sessions_spawn for each unit. Each call returns immediately with a run ID and child session key. The worker starts executing in the background.
    sessions_spawn task="Research competitor pricing for Acme Corp" taskName="research-acme" context="isolated"
    sessions_spawn task="Research competitor pricing for Beta Inc" taskName="research-beta" context="isolated"
    sessions_spawn task="Research competitor pricing for Gamma Ltd" taskName="research-gamma" context="isolated"
3. Call sessions_yield. This ends the parent's current model turn and waits for completion events. When a child finishes, its result arrives as the next message in the parent's session. Do not replace sessions_yield with polling loops, shell sleep commands, or repeated calls to the subagents tool. The official docs are explicit: completion events are pushed, not pulled.
4. Synthesize results. Once child results arrive, the parent agent reviews them and decides what to do next. It might merge the outputs into a single report, spawn additional workers for follow-up tasks, or deliver the final result to the user.
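The push-based shape of this cycle can be illustrated with plain asyncio. This is an analogy, not the OpenClaw API: `worker` and `orchestrate` are hypothetical names, the delays stand in for real work, and a queue plays the role of the completion events.

```python
import asyncio


async def worker(name: str, delay: float, events: asyncio.Queue) -> None:
    """Stand-in for a child session: do the work, then push a completion event."""
    await asyncio.sleep(delay)  # placeholder for the real task
    await events.put((name, f"pricing summary for {name}"))


async def orchestrate(units: dict[str, float]) -> dict[str, str]:
    events: asyncio.Queue = asyncio.Queue()
    # "Spawn": create_task returns immediately and workers run in the background.
    tasks = [asyncio.create_task(worker(n, d, events)) for n, d in units.items()]
    results = {}
    # "Yield": suspend until an event is pushed. No polling loop, no sleep.
    # Note: a worker that crashes never posts an event, so real code needs a timeout.
    while len(results) < len(tasks):
        name, summary = await events.get()
        results[name] = summary
    await asyncio.gather(*tasks)  # every event arrived, so this only reaps the tasks
    return results  # "Synthesize": the parent now holds every child's output


results = asyncio.run(orchestrate({"acme": 0.03, "beta": 0.01, "gamma": 0.02}))
print(sorted(results))
```

The parent never asks whether a child is done. It waits on the event channel and wakes only when a result is delivered, which is the behavior sessions_yield is described as providing.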
maxSpawnDepth controls how deep this hierarchy can go. At the default depth of 1, subagents cannot spawn their own children. Set it to 2 for an orchestrator pattern: your main agent spawns coordinators, which spawn workers. The recommended ceiling for most use cases is 2. The maximum is 5, but deeper hierarchies increase latency and make debugging harder.
When maxSpawnDepth is 2, the tool policy changes automatically. Depth-1 agents (orchestrators) receive session management tools like sessions_spawn, subagents, sessions_list, and sessions_history. Depth-2 agents (leaf workers) do not. This prevents unbounded recursive spawning.
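The depth rule described above reduces to a one-line comparison. The sketch below is a mental model rather than OpenClaw's implementation: `can_spawn` is a hypothetical helper, and numbering the main agent as depth 0 is an assumption.

```python
def can_spawn(depth: int, max_spawn_depth: int = 1) -> bool:
    """True if an agent at `depth` may spawn children.

    Assumption: the main agent is depth 0 and its direct children are depth 1.
    """
    return depth < max_spawn_depth


print(can_spawn(0))                     # main agent, default config: True
print(can_spawn(1))                     # subagent, default config: False
print(can_spawn(1, max_spawn_depth=2))  # orchestrator under depth 2: True
print(can_spawn(2, max_spawn_depth=2))  # leaf worker under depth 2: False
```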
Handle Output Convergence Across Workers
The hard part of parallel agents is not spawning them. It is merging their output when they all write to the same workspace.
The problem: three workers research three competitors and each tries to write a summary file to the same directory. Without coordination, the last writer wins, or worse, partial writes corrupt the output.
Pattern 1: Parent-as-merger. Workers return results through the announcement chain instead of writing files directly. The parent agent receives structured output from each child's completion event, merges the data in its own context, and writes the final file once. This is the cleanest approach for jobs with fewer than 10 workers.
Pattern 2: File-per-worker. Each worker writes to a unique output path (e.g., research-acme.md, research-beta.md). The parent reads all files after workers complete and produces a merged output. This works well when individual outputs are large and would bloat the parent's context window.
Pattern 3: Shared workspace with file locks. When workers need to append to a shared resource, use a storage layer that supports locking. Fast.io workspaces provide file locks that let agents acquire and release access to prevent write conflicts. An agent locks a file, appends its section, and releases the lock. Other agents queue until the lock is available. Combined with versioning and audit trails, this approach handles convergence for larger teams of parallel agents without data loss.
For file-heavy convergence workflows, a cloud workspace outperforms local filesystems because every worker accesses the same source of truth regardless of where it runs. Local directories work for single-machine setups, S3 handles raw object storage, and Google Drive or Dropbox cover basic file sync. Fast.io adds file locks, automatic indexing through Intelligence Mode, and MCP-native access that lets OpenClaw agents interact with the workspace directly through tool calls. The free agent plan includes 50GB of storage and 5,000 monthly credits with no credit card required. More details at Fast.io for agents and the OpenClaw integration page.
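For a single-machine setup, Pattern 2 (file-per-worker) can be sketched with the standard library alone. The helper names, the `report.md` filename, and the placeholder file contents are all hypothetical. The point is only that each writer owns its path and the parent is the single writer of the merged file.

```python
import tempfile
from pathlib import Path


def write_worker_output(out_dir: Path, task_name: str, body: str) -> Path:
    """Each worker writes only to its own path, so no two writers collide."""
    path = out_dir / f"{task_name}.md"
    path.write_text(body, encoding="utf-8")
    return path


def merge_outputs(out_dir: Path, merged_name: str = "report.md") -> str:
    """Parent step: read per-worker files in a stable order, write one report."""
    parts = sorted(out_dir.glob("research-*.md"))  # does not match report.md
    merged = "\n\n".join(p.read_text(encoding="utf-8").strip() for p in parts)
    (out_dir / merged_name).write_text(merged + "\n", encoding="utf-8")
    return merged


with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    # Stand-ins for three workers finishing in any order.
    for name in ("research-gamma", "research-acme", "research-beta"):
        write_worker_output(out, name, f"# {name}\nfindings go here\n")
    report = merge_outputs(out)

print(report.splitlines()[0])  # heading of the first file in sorted order
```

Sorting the paths before merging keeps the report stable regardless of which worker finished first.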
Cost note: parallel workers multiply token usage proportionally. Three workers that each process a 10,000-token task consume 30,000 tokens in total, not 10,000. Factor this into your maxConcurrent tuning. Running the orchestrator on a capable model like Claude Opus while using a cheaper model for workers (configurable via the model key in your subagent defaults) is a practical way to keep costs reasonable without sacrificing output quality on the coordination layer.
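The arithmetic is worth running before raising any limits. In the sketch below, `run_cost` is a hypothetical helper, and the per-million-token prices and the orchestrator's 5,000 tokens are invented for illustration. Substitute your provider's actual rates.

```python
def run_cost(worker_count: int, tokens_per_worker: int, worker_price: float,
             orchestrator_tokens: int, orchestrator_price: float) -> float:
    """Total cost in dollars; prices are dollars per million tokens."""
    worker_tokens = worker_count * tokens_per_worker  # scales linearly with workers
    total = worker_tokens * worker_price + orchestrator_tokens * orchestrator_price
    return total / 1_000_000


# Hypothetical prices: 15.0 for the capable model, 1.0 for the cheaper one.
same_model = run_cost(3, 10_000, 15.0, 5_000, 15.0)
split_models = run_cost(3, 10_000, 1.0, 5_000, 15.0)
print(round(same_model, 3))    # 0.525
print(round(split_models, 3))  # 0.105
```

Under these made-up prices, moving only the workers to the cheaper model cuts the run cost by 80% while the orchestrator keeps the stronger model.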
How to Debug Parallel Agent Failures
Workers start but results never arrive. Check runTimeoutSeconds in your config. The default is 900 seconds (15 minutes). Long-running workers may be timing out silently. Increase the timeout or break the task into smaller units.
"Tool not found" errors in subagents. Subagents do not inherit all tools by default. Session management tools (sessions_spawn, sessions_list, sessions_send) are excluded from depth-2 workers by design. If a worker needs a specific tool, configure it explicitly in the tools.subagents.tools.allow array.
Hitting the concurrency ceiling. If you see workers queuing unexpectedly, check both maxConcurrent and maxChildrenPerAgent. The tighter constraint wins. Use the /subagents list command to see active and queued children for the current session.
Stale sessions accumulating. Subagent sessions auto-archive after archiveAfterMinutes (default 60). If you run many short-lived workers, you may want to reduce this value. Transcripts are retained but moved out of active session storage.
Container permission errors on spawn. When running OpenClaw in a container, verify that the container UID matches the host file ownership for ~/.openclaw/openclaw.json. A UID mismatch causes the gateway to look healthy while spawns fail with EACCES errors. Run ls -la ~/.openclaw/openclaw.json and id to diagnose.
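The same ownership check can be scripted so it runs inside the container at startup. This is a minimal sketch for Unix-like systems: `uid_matches_owner` is a hypothetical helper, and it compares only the owning UID, not the file's mode bits.

```python
import os
from pathlib import Path


def uid_matches_owner(path: Path) -> bool:
    """True when the current process UID owns `path` (Unix only)."""
    return path.stat().st_uid == os.getuid()


config = Path.home() / ".openclaw" / "openclaw.json"
if config.exists():
    print("owner matches process UID:", uid_matches_owner(config))
else:
    print("config not found at", config)
```

A result of False here corresponds to the mismatch that ls -la and id would reveal by hand.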
Empty results from workers with thinking enabled. Thinking mode in unattended subagents can produce empty announcement strings. For workers that run without human interaction, consider disabling thinking mode in the subagent configuration and reserving it for the parent orchestrator.
Frequently Asked Questions
How many agents can OpenClaw run in parallel?
The global concurrency cap is controlled by maxConcurrent, which defaults to 8. You can increase it, but each additional lane multiplies token consumption. The per-session cap, maxChildrenPerAgent, defaults to 5 and limits how many children a single orchestrator can spawn simultaneously.
How do I configure concurrent subagents in OpenClaw?
Set maxConcurrent and maxChildrenPerAgent in your agents.defaults.subagents configuration block. maxConcurrent is the system-wide ceiling, while maxChildrenPerAgent limits fan-out per session. Adjust maxSpawnDepth (default 1, max 5) if you need orchestrator agents that spawn their own sub-workers.
What is maxConcurrent in OpenClaw?
maxConcurrent is a global limit on the total number of parallel subagent executions running at any time across your OpenClaw system. It operates as a dedicated queue lane named 'subagent.' When the limit is reached, additional spawn requests queue until a running worker completes.
How do parallel OpenClaw agents share data?
Parallel subagents do not share memory or context automatically. Each worker runs in its own session with an independent context window. Data sharing happens through the parent agent, which receives completion events from each child and synthesizes the results. For file-based convergence, workers can write to separate output paths or use a shared workspace with file locks to prevent write conflicts.
Should I use isolated or fork context for parallel workers?
Use isolated context (the default) for independent tasks where each worker does not need prior conversation history. Use fork context when the worker needs to reference earlier decisions or analysis from the parent session. Fork costs more tokens because it copies the parent's full transcript into the child.
How do I reduce token costs when running parallel agents?
Run the orchestrator on a high-quality model and set a cheaper model for workers using the model key in your subagent defaults. Use isolated context instead of fork to avoid duplicating the parent transcript. Decompose tasks so each worker handles a focused unit rather than re-processing shared context.