How to Use Subagent Delegation in Hermes Agent
Hermes Agent's delegate_task tool spawns child AI agents with isolated contexts and restricted toolsets. This guide covers single-task and batch delegation, toolset restrictions, orchestrator hierarchies with configurable spawn depth, file coordination for concurrent agents, and monitoring with the /agents overlay.
How delegate_task Works in Hermes Agent
Nous Research's Hermes Agent includes a built-in delegation system that lets a parent agent spawn child agents to handle subtasks. The core mechanism is a single tool called delegate_task, which creates an isolated child AIAgent instance with its own conversation context, terminal session, and restricted toolset.
The critical design decision here is context isolation. Subagents start with a completely fresh conversation. They have zero knowledge of the parent's history, prior tool calls, or accumulated reasoning. The parent must explicitly pass everything the child needs through two parameters:
- goal: What the subagent should accomplish
- context: File paths, error messages, project structure, constraints, and any other information the child needs to do its work
This isolation is intentional. It keeps each child's context window clean and prevents token bloat from cascading through the delegation tree. Only the final summary from each subagent re-enters the parent's context, not the intermediate tool calls or reasoning steps.
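Because the child sees nothing but goal and context, it can help to assemble the context deliberately rather than ad hoc. The sketch below is one way to do that; DelegationBrief and to_kwargs are hypothetical helper names introduced here for illustration, not part of Hermes Agent, and the output is simply a dictionary shaped like the two parameters described above.

```python
from dataclasses import dataclass, field


@dataclass
class DelegationBrief:
    """Hypothetical container for what a child needs; not a Hermes type."""
    goal: str
    files: list[str] = field(default_factory=list)
    errors: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

    def to_kwargs(self) -> dict:
        """Flatten the brief into a goal/context pair."""
        sections = [
            ("Files", self.files),
            ("Errors", self.errors),
            ("Constraints", self.constraints),
        ]
        # Skip empty sections so the child's context stays compact.
        lines = [f"{title}: " + "; ".join(items) for title, items in sections if items]
        return {"goal": self.goal, "context": "\n".join(lines)}


brief = DelegationBrief(
    goal="Debug why tests fail in test_foo.py",
    files=["test_foo.py"],
    errors=["assertion failure on line 42"],
    constraints=["Project uses pytest"],
)
kwargs = brief.to_kwargs()
print(kwargs["context"])
```

Listing files, errors, and constraints as separate fields makes it harder to forget one of them, which matters because the child cannot ask the parent for anything that was left out.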
Here is what a basic single-task delegation looks like:
delegate_task(
    goal="Debug why tests fail in test_foo.py",
    context="Error: assertion failure on line 42. Project uses pytest.",
    toolsets=["terminal", "file"]
)
The parent blocks until the child finishes. This is synchronous execution inside the parent's current turn, not a background job queue. Interrupting the parent cancels all active children and discards their work. For durable, persistent tasks, Hermes provides the cronjob tool or backgrounded terminal commands instead.
Delegation makes sense when a subtask requires multi-step reasoning, judgment, or exploration. If the work is mechanical (data transformation, file concatenation, API calls with predictable responses), execute_code is cheaper and faster because it skips the LLM reasoning loop entirely.
Restricted Tool Sets for Each Subagent
Every subagent receives a restricted set of tools defined by the toolsets parameter. This is one of the features that distinguishes Hermes delegation from generic multi-agent frameworks: you specify exactly which capabilities each child gets, and blocked tools are enforced at the runtime level.
Three common toolset patterns cover most use cases:
- ["terminal", "file"] for coding, debugging, and file manipulation
- ["web"] for research tasks that need search and page fetching
- ["file"] for read-only analysis where the child should inspect but not execute anything
Certain tools are blocked for subagents regardless of the toolsets you pass:
- delegation (leaf agents cannot spawn further children; only orchestrators keep it)
- clarify (subagents cannot ask the user questions)
- memory (subagents cannot read or write persistent memory)
- code_execution (the Python sandbox is reserved for parent agents)
- send_message (subagents cannot use messaging gateways)
Delegation is blocked by default because every subagent starts as a "leaf" node. To allow a child to spawn its own workers, you need to explicitly set its role to "orchestrator" and configure the spawn depth, which we will cover in the orchestrator section below.
Toolset restrictions serve two purposes. First, they reduce the attack surface. A research subagent that only has ["web"] cannot modify your filesystem or execute arbitrary commands. Second, they keep costs predictable. A child with fewer tools makes fewer tool calls, which means fewer LLM turns and lower token consumption.
When you combine toolset restrictions with the context isolation described above, each subagent operates in a well-defined sandbox: it knows only what you told it, and it can do only what you allowed.
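To make the rule concrete, here is a minimal model of how a requested toolset list could be filtered against the blocked names. The tool names come from the list above; the function effective_toolsets and its filtering logic are assumptions made for illustration and do not describe how Hermes enforces the restriction internally.

```python
# Blocked names as listed in this guide; the logic below is an illustrative model.
ALWAYS_BLOCKED = {"clarify", "memory", "code_execution", "send_message"}
LEAF_ONLY_BLOCKED = {"delegation"}


def effective_toolsets(requested: list[str], role: str = "leaf") -> list[str]:
    """Return the requested toolsets minus anything blocked for this role."""
    blocked = set(ALWAYS_BLOCKED)
    if role != "orchestrator":
        # Leaf agents additionally lose the ability to delegate.
        blocked |= LEAF_ONLY_BLOCKED
    return [name for name in requested if name not in blocked]


print(effective_toolsets(["web", "memory"]))
print(effective_toolsets(["file", "delegation"], role="orchestrator"))
```

The point of the model is that a blocked tool never reaches the child, no matter what the parent asks for.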
Parallel Batch Delegation
Single-task delegation is useful, but the real productivity gain comes from batch mode. Instead of spawning one child at a time, you pass an array of task objects to delegate_task, and Hermes runs them concurrently using a thread pool.
delegate_task(tasks=[
    {"goal": "Research competitor pricing for Tool A", "toolsets": ["web"]},
    {"goal": "Research competitor pricing for Tool B", "toolsets": ["web"]},
    {"goal": "Fix lint errors in src/utils.ts", "toolsets": ["terminal", "file"]}
])
By default, Hermes runs up to 3 concurrent subagents per batch. This is controlled by the max_concurrent_children setting, which has a floor of 1 and no hard ceiling. You can raise it through configuration:
delegation:
  max_concurrent_children: 5
Or through the environment variable DELEGATION_MAX_CONCURRENT_CHILDREN.
Results come back sorted by task index regardless of completion order, so the first task in your array always maps to the first result. If you submit more tasks than the concurrency limit allows, Hermes returns a tool error rather than silently truncating. You need to either split the batch or raise the limit.
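The behavior described here (a thread pool, results in task order, an error for oversized batches, a floor of 1 on the limit) can be modeled in a few lines of standard-library Python. This is a sketch under stated assumptions, not Hermes source code: run_batch, resolve_limit, and fake_child are hypothetical names, and how Hermes combines the config file with the environment variable is not covered in this guide.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor

DEFAULT_LIMIT = 3  # default concurrency stated in this guide


def resolve_limit(env=None) -> int:
    """Read the limit from the environment, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("DELEGATION_MAX_CONCURRENT_CHILDREN")
    limit = int(raw) if raw else DEFAULT_LIMIT
    return max(1, limit)  # enforce the floor of 1


def run_batch(tasks, run_child, limit):
    """Run every task concurrently and return results in task order."""
    if len(tasks) > limit:
        # Fail loudly instead of silently truncating the batch.
        raise ValueError(f"{len(tasks)} tasks exceed the limit of {limit}")
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # map() yields results in submission order, not completion order.
        return list(pool.map(run_child, tasks))


def fake_child(task):
    """Stand-in for a subagent: sleeps, then returns a summary string."""
    time.sleep(task["delay"])
    return f"done: {task['goal']}"


tasks = [
    {"goal": "slow task", "delay": 0.05},
    {"goal": "fast task", "delay": 0.0},
]
print(run_batch(tasks, fake_child, limit=resolve_limit({})))
```

Even though the second task finishes first, its result stays in second position, which is the property that lets the parent match results to tasks by index.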
Each child in a batch gets its own terminal session and its own context window. They do not share state with each other during execution. This means two subagents working on different files will not interfere with each other, but two subagents editing the same file could create conflicts.
Hermes addressed this problem in v0.11.0 (April 2026) with a file coordination layer that prevents concurrent sibling subagents from clobbering each other's edits. The coordination mechanism synchronizes filesystem state across concurrent children so that parallel execution remains safe even when agents operate on overlapping files.
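This guide does not describe how the coordination layer works internally, so the following is only an analogy for the underlying problem: concurrent workers doing read-modify-write on the same file. It swaps in a plain per-path lock, a standard technique that is not claimed to be Hermes's mechanism. FileCoordinator and append_line are hypothetical names, and a dictionary stands in for the filesystem.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class FileCoordinator:
    """Hands out one lock per path so edits to the same file are serialized."""

    def __init__(self):
        self._guard = threading.Lock()  # protects the lock table itself
        self._locks = defaultdict(threading.Lock)

    def lock_for(self, path):
        with self._guard:
            return self._locks[path]


def append_line(coordinator, files, path, line):
    """Read-modify-write a file while holding that file's lock."""
    with coordinator.lock_for(path):
        current = files.get(path, "")
        files[path] = current + line + "\n"


coordinator = FileCoordinator()
files = {}  # in-memory stand-in for the filesystem
with ThreadPoolExecutor(max_workers=3) as pool:
    for i in range(30):
        pool.submit(append_line, coordinator, files, "notes.md", f"finding {i}")

print(len(files["notes.md"].splitlines()))
```

Workers touching different paths never wait on each other, while workers touching the same path take turns, so no appended line is lost.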
Batch delegation works best when tasks are genuinely independent. Research tasks are a natural fit: spin up three web-only children to investigate different topics simultaneously, then synthesize their summaries in the parent. Code tasks work well when each child handles a different file or module. Avoid batching tasks with sequential dependencies, since each child cannot see what the others are doing.
Give Your Hermes Subagents a Persistent Workspace
Fast.io provides cloud workspaces with file locks, auto-indexing, and MCP access so your delegation output survives beyond the session. 50 GB free, no credit card required.
Orchestrator Role and Multi-Level Hierarchies
By default, every subagent is a leaf node. It completes its task and returns a summary. It cannot delegate further. For workflows that require deeper decomposition, Hermes supports an orchestrator role that lets child agents spawn their own workers.
To enable this, set role="orchestrator" on the delegate_task call:
delegate_task(
    goal="Survey three deployment approaches and recommend one",
    role="orchestrator",
    context="We need to deploy a Python API. Consider Docker, serverless, and VM options."
)
An orchestrator child retains the delegation toolset that is normally blocked for leaf agents. It can call delegate_task itself, creating a second level of workers.
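The depth of this hierarchy is controlled by max_spawn_depth:
- 1 (default): flat delegation with no nesting; every child is a leaf
- 2: orchestrator children can spawn leaf workers
- 3: three-level trees, the deepest hierarchy supported
You can set this in configuration: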
delegation:
  max_spawn_depth: 2
  orchestrator_enabled: true
The orchestrator_enabled flag is a global kill switch. Setting it to false forces all children to leaf status regardless of the depth setting or the role parameter in individual delegate_task calls.
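The interaction between the role parameter, the depth limit, and the kill switch can be summarized as a small decision function. This is a sketch of the rules as this guide states them, with one assumption labeled in the code: depth is counted from 1 for a direct child of the top-level agent. resolve_role is a hypothetical name, not a Hermes function.

```python
def resolve_role(requested_role, child_depth, max_spawn_depth=1, orchestrator_enabled=True):
    """Decide whether a child runs as an orchestrator or a leaf.

    Assumption: child_depth is 1 for a direct child of the top-level agent,
    2 for a grandchild, and so on.
    """
    if not orchestrator_enabled:
        # Global kill switch: everything becomes a leaf.
        return "leaf"
    if requested_role == "orchestrator" and child_depth < max_spawn_depth:
        # There is still room below this child for another level of workers.
        return "orchestrator"
    return "leaf"


print(resolve_role("orchestrator", child_depth=1, max_spawn_depth=2))
print(resolve_role("orchestrator", child_depth=2, max_spawn_depth=2))
print(resolve_role("orchestrator", child_depth=1, max_spawn_depth=2, orchestrator_enabled=False))
```

Under this model the default depth of 1 leaves no room for nesting, so a request for the orchestrator role only takes effect once max_spawn_depth is raised to 2 or 3.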
Cost scales multiplicatively with nested delegation. At depth 3 with the default concurrency of 3, a single delegate_task call could theoretically spawn up to 27 concurrent leaf agents (3 × 3 × 3). Each one runs its own LLM reasoning loop, consuming tokens independently, and raising max_concurrent_children pushes that figure higher still. The official docs include a cost warning for this exact scenario.
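The arithmetic behind that figure is easy to check. The helper names below are hypothetical; the calculation assumes every agent at every level uses its full concurrency allowance, which is the worst case rather than the typical one.

```python
def leaf_agents(concurrency: int, depth: int) -> int:
    """Worst-case number of agents on the deepest level."""
    return concurrency ** depth


def total_agents(concurrency: int, depth: int) -> int:
    """Worst-case number of spawned agents across all levels."""
    return sum(concurrency ** level for level in range(1, depth + 1))


print(leaf_agents(3, 3))   # 27
print(total_agents(3, 3))  # 39 (3 + 9 + 27)
print(leaf_agents(5, 3))   # 125
```

The 27 leaves are not the whole bill: the 12 orchestrators above them also run their own reasoning loops, and raising concurrency from 3 to 5 at the same depth takes the leaf count from 27 to 125.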
In practice, two-level hierarchies (depth 2) handle most complex workflows. A parent delegates to an orchestrator that breaks the problem into parallel leaf tasks. Three-level trees are rare and should be reserved for genuinely large decomposition problems where an intermediate planning layer adds clear value.
You can also redirect subagents to cheaper models to control costs:
delegation:
  model: "google/gemini-3-flash-preview"
  provider: "openrouter"
Omitting these settings causes subagents to inherit the parent's model and provider. For research or analysis tasks where raw reasoning power matters less than speed, pointing children at a faster, cheaper model is a practical way to keep delegation affordable.
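The inheritance rule amounts to "use the override if present, otherwise fall back to the parent". A minimal sketch, assuming the settings are available as plain dictionaries; child_model_settings and the parent values are hypothetical and only illustrate the fallback.

```python
def child_model_settings(parent: dict, delegation_cfg: dict) -> dict:
    """Pick the child's model and provider, inheriting from the parent if unset."""
    return {
        "model": delegation_cfg.get("model") or parent["model"],
        "provider": delegation_cfg.get("provider") or parent["provider"],
    }


# Placeholder parent settings, made up for this example.
parent = {"model": "parent-model", "provider": "parent-provider"}

override = {"model": "google/gemini-3-flash-preview", "provider": "openrouter"}
print(child_model_settings(parent, override))
print(child_model_settings(parent, {}))
```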
Monitoring Subagents with the /agents Overlay
Running parallel subagents without visibility into their progress is a recipe for wasted compute. Hermes provides the /agents overlay (aliased as /tasks) in its terminal UI for real-time monitoring of all active and completed delegation trees.
The overlay shows:
- Live tree view of running and finished subagents grouped by parent
- Per-branch cost and token rollups so you can see which subtask is consuming the most resources
- File-touch summaries showing which files each subagent has read or modified
- Kill and pause controls for individual subagents mid-flight
- Post-hoc review of each subagent's step-by-step turn history
Kill controls are particularly useful when a subagent gets stuck in a loop or starts consuming tokens without making progress. You can terminate a single child without affecting its siblings or the parent. Interrupt propagation works in the other direction too: killing a parent cascades to all its active children, including nested orchestrator trees.
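The two directions of propagation can be pictured with a small tree model: killing a node cancels everything beneath it and leaves siblings and ancestors alone. AgentNode is a hypothetical class written for this illustration and says nothing about how Hermes tracks its agents.

```python
from dataclasses import dataclass, field


@dataclass
class AgentNode:
    """Illustrative node in a delegation tree."""
    name: str
    children: list["AgentNode"] = field(default_factory=list)
    cancelled: bool = False

    def spawn(self, name: str) -> "AgentNode":
        child = AgentNode(name)
        self.children.append(child)
        return child

    def kill(self) -> None:
        """Cancel this agent and cascade to every descendant."""
        self.cancelled = True
        for child in self.children:
            child.kill()


parent = AgentNode("parent")
orchestrator = parent.spawn("orchestrator")
worker_a = orchestrator.spawn("worker-a")
worker_b = orchestrator.spawn("worker-b")
researcher = parent.spawn("researcher")

orchestrator.kill()
print([n.name for n in (orchestrator, worker_a, worker_b) if n.cancelled])
print(researcher.cancelled, parent.cancelled)
```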
The post-hoc review feature lets you inspect exactly what each subagent did after it finishes. This is valuable for debugging unexpected results. If a research subagent returned a summary that seems wrong, you can step through its search queries, page fetches, and reasoning to find where it went off track.
For teams running delegation-heavy workflows, the /agents overlay is the primary tool for understanding where time and money go. A common pattern is to run a batch delegation, watch the overlay for stragglers, kill any child that takes more than twice as long as its siblings, and re-run that specific subtask with more explicit context.
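The straggler rule of thumb can be written down as a small check. This sketch assumes you have each child's elapsed time in seconds, for example read off the overlay by hand; find_stragglers is a hypothetical helper, and comparing against the median of the siblings is one reasonable reading of "twice as long as its siblings".

```python
from statistics import median


def find_stragglers(elapsed: dict, factor: float = 2.0) -> list:
    """Return children whose runtime exceeds factor times the sibling median."""
    stragglers = []
    for name, seconds in elapsed.items():
        # Compare each child only against its siblings, not against itself.
        siblings = [s for other, s in elapsed.items() if other != name]
        if siblings and seconds > factor * median(siblings):
            stragglers.append(name)
    return stragglers


elapsed = {"tool-a-research": 40, "tool-b-research": 45, "lint-fix": 130}
print(find_stragglers(elapsed))
```

Here only lint-fix is flagged: its two siblings have a median of 42.5 seconds, and 130 seconds is more than twice that.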
Storing Subagent Output in a Persistent Workspace
Hermes Agent delegation runs in-memory. Subagent summaries exist in the parent's context window for the duration of the session, but nothing persists automatically after the conversation ends. If a subagent researches competitor pricing or generates a report, that output disappears when the parent session closes.
For workflows where delegation results need to outlive the session, you need persistent storage. Local filesystems work for single-machine setups, but they break down when agents run on remote servers, containers, or scheduled cron jobs where the filesystem resets between runs.
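For the single-machine case, persisting results can be as simple as appending each summary to a JSON Lines file once the batch returns. This is a generic sketch using only the standard library; save_summaries, load_summaries, the file name, and the sample summaries are all hypothetical, and it assumes the parent has the summaries available as a list of strings in task order.

```python
import json
import tempfile
from pathlib import Path


def save_summaries(summaries: list, path: Path) -> None:
    """Append one JSON object per subagent summary."""
    with path.open("a", encoding="utf-8") as fh:
        for index, summary in enumerate(summaries):
            fh.write(json.dumps({"task_index": index, "summary": summary}) + "\n")


def load_summaries(path: Path) -> list:
    """Read every stored summary back as a list of dictionaries."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# A temporary directory keeps the example self-contained.
out = Path(tempfile.mkdtemp()) / "delegation_results.jsonl"
save_summaries(["Tool A pricing notes", "Tool B pricing notes"], out)
print(load_summaries(out))
```

The same limitation applies as described above: a file like this survives the session only as long as the disk it lives on does.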
Fast.io provides persistent cloud workspaces that agents can write to via API or MCP server. The workflow looks like this: a parent agent delegates research to three parallel subagents, each writes findings to a shared workspace, and a human reviewer accesses the same workspace through the web UI to review and approve the output.
Fast.io's MCP server exposes Streamable HTTP at /mcp and legacy SSE at /sse. Hermes Agent can connect to it as an MCP endpoint, giving subagents access to workspace operations without custom API integration code.
For multi-agent workflows specifically, two Fast.io features matter:
File locks prevent concurrent access conflicts. If two subagents need to append to the same output file, they can acquire and release locks through the API to avoid overwriting each other's contributions. This complements Hermes's built-in file coordination layer by extending conflict prevention to cloud storage.
Intelligence Mode auto-indexes uploaded files for semantic search and RAG chat. When subagents write research findings to a workspace, those documents become searchable and queryable immediately. A human reviewer can ask questions about the collected research without reading every file individually.
The free agent plan includes 50 GB of storage, 5,000 monthly credits, and 5 workspaces with no credit card required. For a Hermes delegation workflow that generates a few documents per run, the free tier covers months of operation before hitting any limits.
Other storage options include S3-compatible buckets (good for raw file dumps but no built-in intelligence layer), Google Drive (familiar UI but no MCP integration), and local NAS storage (fast but not accessible from remote or containerized agents).
Frequently Asked Questions
How do I delegate tasks to subagents in Hermes Agent?
Use the delegate_task tool with a goal, context, and toolsets parameter. The goal describes what the subagent should accomplish. The context field passes all necessary information since subagents start with zero knowledge of the parent conversation. The toolsets parameter restricts which tools the child can use, such as ["terminal", "file"] for coding or ["web"] for research.
Can Hermes Agent run multiple agents in parallel?
Yes. Pass a tasks array to delegate_task instead of a single goal. Hermes runs up to 3 concurrent subagents by default using a thread pool. You can increase this limit by setting max_concurrent_children in the delegation configuration or through the DELEGATION_MAX_CONCURRENT_CHILDREN environment variable.
What toolsets can subagents access in Hermes Agent?
Subagents can access combinations of terminal, file, and web toolsets. Common patterns are ["terminal", "file"] for code work, ["web"] for research, and ["file"] for read-only analysis. Delegation, clarify, memory, code_execution, and send_message are always blocked for leaf subagents.
How deep can Hermes Agent delegation hierarchies go?
Up to three levels, controlled by the max_spawn_depth setting. The default is 1 (flat, no nesting). Set it to 2 for orchestrator children that spawn leaf workers, or 3 for three-level trees. The orchestrator_enabled flag can globally disable nested delegation regardless of depth settings.
Does delegate_task run in the background?
No. delegate_task runs synchronously inside the parent's current turn. It blocks the parent until every child finishes. Interrupting the parent cancels all active children and discards their work. For durable background tasks, use the cronjob tool or backgrounded terminal commands instead.
How do I prevent file conflicts between parallel subagents?
Hermes Agent v0.11.0 introduced a file coordination layer that synchronizes filesystem state across concurrent sibling subagents. This prevents children from clobbering each other's edits during parallel execution. For additional protection when writing to cloud storage, Fast.io provides file locks that agents can acquire and release through the API.