Hermes Agent Security: Container Isolation, Authorization, and Safe Deployment
Hermes Agent ships with a seven-layer security model that covers command approval, user authorization, container hardening, credential filtering, content scanning, session isolation, and input sanitization. This guide walks through each layer, explains the configuration trade-offs, compares security postures across all seven deployment backends, and shows how to pair Hermes with external persistent storage for production deployments.
How the Seven-Layer Defense Model Works
Nous Research Hermes Agent is an open-source (MIT-licensed) AI agent built for persistent, autonomous operation. That autonomy creates a real tension: the agent needs to run shell commands, install packages, and manage files, but those same capabilities can destroy a host system if left unchecked.
Hermes addresses this with seven overlapping security layers, applied in order during every tool call:
- User authorization checks whether the person (or bot) sending a message is allowed to interact at all, using allowlists and a pairing-code system.
- Dangerous command approval intercepts destructive operations like
rm -rforDROP TABLEbefore they execute. - Container isolation runs agent commands inside Docker, Singularity, Modal, or another sandboxed backend so mistakes stay contained.
- MCP credential filtering strips API keys and tokens from environment variables passed to MCP subprocesses.
- Context file scanning detects prompt injection attempts in project files like AGENTS.md or .cursorrules before loading them into the system prompt.
- Cross-session isolation separates data and state between concurrent tasks.
- Input sanitization validates working directories and parameters before execution.
This defense-in-depth approach means no single layer has to be perfect. A command that slips past approval still hits the container boundary. A prompt injection that evades content scanning still faces credential filtering. Each layer catches what the previous one missed.
Command Approval and the Hardline Blocklist
Before Hermes executes any shell command, it checks against a curated set of dangerous patterns. The approval system has three modes, configured via approvals.mode in ~/.hermes/config.yaml:
- Manual (default): Every flagged command pauses and asks the user to approve or deny.
- Smart: An auxiliary LLM assesses risk. Low-risk commands auto-approve, genuinely dangerous ones auto-deny, and uncertain cases escalate to the user.
- Off: Disables all approval checks. Equivalent to running
hermes --yoloor sending/yoloin chat.
The patterns that trigger approval cover the operations most likely to cause irreversible damage:
- Recursive deletion (
rm -r,find -delete,xargs rm) - SQL destructive operations (
DROP TABLE,DELETE FROMwithout a WHERE clause,TRUNCATE) - Permission changes (
chmod 777,chmod o+w, recursivechownto root) - Filesystem formatting (
mkfs,dd if=/dev/zero) - System service manipulation (
systemctl stop,systemctl disable) - Piping untrusted URLs to shell interpreters (
curl | sh,bash <(curl ...)) - Overwrites to sensitive paths (
/etc/,~/.ssh/,~/.hermes/.env)
When a dangerous command is detected in the CLI, the user gets four options: approve once, approve for the session, add to a permanent allowlist, or deny. In messaging gateway mode (Telegram, Discord, Slack), the agent sends the command details to chat and waits for a yes/no reply. If no response arrives within the configured timeout (60 seconds by default), the command is denied. Fail-closed.
Beneath the configurable approval system sits a hardline blocklist that cannot be disabled by any setting, including YOLO mode. These are the "never under any circumstances" operations:
rm -rf /and its variants- Bash fork bombs (
:(){ :|:& };:) mkfs.*on mounted root devicesdd if=/dev/zero of=/dev/sd*
When a hardline-blocked command is attempted, Hermes returns an error message explaining why it was blocked, without executing anything.
Persist Hermes Agent outputs across container rebuilds
Free 50GB workspace with versioned storage, audit trails, and MCP access. No credit card, no trial expiration.
Container Hardening and Backend Selection
Hermes supports seven deployment backends, each with a different isolation posture. The choice of backend determines whether you rely on command approval (software-level checks) or container boundaries (OS-level isolation) as your primary defense.
Container backends (Docker, Singularity, Modal, Daytona, Vercel Sandbox) skip dangerous command checks entirely because the container itself is the security boundary. If the agent runs rm -rf / inside a Docker container, it destroys the container filesystem, not your host.
Docker containers run with aggressive hardening by default:
- All Linux capabilities dropped (
--cap-drop ALL), with onlyDAC_OVERRIDE,CHOWN, andFOWNERre-added for package manager operations - No privilege escalation (
--security-opt no-new-privileges) - PID limit of 256 to prevent fork bombs
- Separate tmpfs mounts for
/tmp(512MB, nosuid),/var/tmp(256MB, noexec), and/run(64MB, noexec)
Resource limits are configurable in ~/.hermes/config.yaml:
terminal:
backend: docker
docker_image: "nikolaik/python-nodejs:python3.11-nodejs20"
container_cpu: 1
container_memory: 5120
container_disk: 51200
container_persistent: true
The container_persistent flag controls whether workspace data survives between sessions. When set to true, Hermes bind-mounts /workspace and /root from ~/.hermes/sandboxes/docker/<task_id>/ on the host. When false, everything runs on tmpfs and disappears on cleanup. For production deployments that need durable output, persistent mode with external storage is the recommended pattern.
For teams running Hermes as a messaging gateway, the Docker backend combined with the SSH backend offers an extra isolation layer. You run the gateway process on one machine (handling Telegram, Discord, or Slack connections) while actual command execution happens on a separate server via SSH:
terminal:
backend: ssh
This separates the messaging attack surface from the execution environment. Even if someone compromises the gateway, the agent's shell access lives on a different machine behind SSH key authentication.
User Authorization and DM Pairing
When Hermes runs as a messaging gateway (accepting commands via Telegram, Discord, Slack, WhatsApp, or Signal), authorization controls who can talk to the bot. The system checks identity through a layered priority chain:
- Per-platform allow-all flag (e.g.,
DISCORD_ALLOW_ALL_USERS=true) - DM pairing approved list
- Platform-specific allowlists
- Global allowlist (
GATEWAY_ALLOWED_USERS) - Global allow-all (
GATEWAY_ALLOW_ALL_USERS=true) - Default: deny
The simplest approach is hardcoding user IDs in ~/.hermes/.env:
TELEGRAM_ALLOWED_USERS=123456789,987654321
DISCORD_ALLOWED_USERS=111222333444555666
SLACK_ALLOWED_USERS=U01ABC123
But for teams where membership changes, the DM pairing system is more practical. When an unknown user messages the bot, they receive an 8-character pairing code. The bot owner approves the code via CLI (hermes pairing approve telegram ABC12DEF), and the user is permanently authorized for that platform.
The pairing system follows OWASP and NIST SP 800-63-4 guidelines:
- Codes use a 32-character unambiguous alphabet (no
0/Oor1/Iconfusion) - Generated with
secrets.choice()for cryptographic randomness - Each code expires after 1 hour
- Rate limited to 1 request per user per 10 minutes
- Maximum 3 pending codes per platform at any time
- 5 failed attempts trigger a 1-hour lockout
- Pairing data stored with
chmod 0600permissions - Codes are never logged to stdout
Revoking access is equally straightforward: hermes pairing revoke telegram 123456789 removes a user, and hermes pairing clear-pending wipes all pending codes.
Credential Filtering and SSRF Protection
Even inside a hardened container, the agent interacts with external services through MCP servers, web requests, and environment variables. Hermes applies credential controls at each boundary.
MCP environment isolation. When Hermes launches an MCP subprocess, only safe system variables pass through: PATH, HOME, USER, LANG, LC_ALL, TERM, SHELL, TMPDIR, and XDG_* prefixed variables. Every API key, token, and secret is stripped. Services that need credentials get them through explicit per-server configuration:
mcp_servers:
github:
command: "npx"
args: ["-y", "@modelcontextprotocol/server-github"]
env:
GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_..."
This prevents a compromised or poorly written MCP server from harvesting credentials meant for other services.
Credential redaction. When tool execution produces error messages, Hermes scrubs them before returning content to the LLM. Patterns matching GitHub PATs (ghp_...), OpenAI-style keys (sk-...), Bearer tokens, and common parameter names (token=, key=, API_KEY=, password=, secret=) are all redacted. The LLM never sees raw credentials in error output.
SSRF protection. Every URL the agent fetches is validated against a blocklist of private and internal addresses before the request fires:
- Private networks (RFC 1918):
10.0.0.0/8,172.16.0.0/12,192.168.0.0/16 - Loopback:
127.0.0.0/8and::1 - Link-local:
169.254.0.0/16, including cloud metadata endpoints at169.254.169.254 - CGNAT ranges (RFC 6598):
100.64.0.0/10, covering Tailscale and WireGuard networks - Cloud metadata hostnames:
metadata.google.internal,metadata.goog
DNS failures are treated as blocked (fail-closed), and redirect chains are re-validated at each hop. This prevents the agent from being tricked into hitting internal services or cloud metadata APIs through DNS rebinding or redirect chains.
For legitimate internal network access (home labs, LAN-only Ollama instances), you can set security.allow_private_urls: true in the config. But for any public-facing gateway deployment, leave this off.
Website blocklist. You can also block specific domains from agent access:
security:
website_blocklist:
enabled: true
domains:
- "*.internal.company.com"
- "admin.example.com"
This applies across all URL-capable tools: web search, content extraction, and browser navigation.
Pre-Exec Scanning and Prompt Injection Defense
Two additional security layers protect against attacks that target the LLM itself rather than the host system.
Tirith pre-execution scanning. Hermes integrates Tirith, a content-level scanner that checks commands before execution. Tirith detects homograph URL spoofing (internationalized domain attacks that make malicious URLs look legitimate), pipe-to-interpreter patterns (curl | bash), and terminal injection attacks. The scanner auto-installs from GitHub releases with SHA-256 checksum verification.
When Tirith flags a command, the finding integrates directly with the approval flow. The user sees the severity, a description of the threat, and suggested safer alternatives. The default action for unattended execution is deny.
You can configure Tirith's behavior in ~/.hermes/config.yaml:
security:
tirith_enabled: true
tirith_timeout: 5
tirith_fail_open: true
The tirith_fail_open setting controls what happens when Tirith itself is unavailable (crashed, not installed, timed out). The default (true) lets commands proceed. In high-security environments, set this to false so a missing scanner blocks all execution.
Context file injection protection. Before Hermes loads project context files (AGENTS.md, .cursorrules, SOUL.md) into the system prompt, it scans them for prompt injection patterns:
- Instructions to ignore prior instructions
- Hidden HTML comments containing suspicious keywords
- Attempts to read secrets (
.env,credentials,.netrc) - Credential exfiltration via
curl - Invisible Unicode characters (zero-width spaces, bidirectional text overrides)
Blocked files produce a clear warning instead of silently loading: [BLOCKED: AGENTS.md contained potential prompt injection (prompt_injection). Content not loaded.]
This matters because project context files are often contributed by multiple team members or pulled from external repositories. A single malicious AGENTS.md file could otherwise instruct the agent to exfiltrate API keys or execute destructive commands. Scanning these files before they reach the LLM closes that attack vector.
Storing Security Audit Trails
For teams running Hermes in production, the security logs at ~/.hermes/logs/ capture authorization attempts, command approvals, and blocked operations. But those logs live on the agent's host and can be lost during container restarts or infrastructure changes.
Pairing Hermes with an external workspace platform solves the persistence problem. Fast.io provides persistent, versioned storage that agents can write to through the MCP server or REST API. Security logs, audit trails, and agent outputs persist across sessions and container rebuilds. The workspace's built-in Intelligence indexing means you can search audit logs semantically ("show me all denied commands from last week") rather than grepping through raw text files.
The free agent plan includes 50GB of storage, 5,000 credits per month, and 5 workspaces with no credit card required. For a Hermes deployment that generates security-sensitive output, having an independent storage layer that survives container teardowns is worth the five-minute setup.
How to Secure Hermes Agent in Production
A secure Hermes deployment touches every layer of the defense model. Here is the recommended configuration for a production gateway:
Authorization. Set explicit allowlists for every messaging platform you expose. Never use GATEWAY_ALLOW_ALL_USERS=true in production. Enable DM pairing so new team members can self-onboard without you editing config files, and periodically audit the approved users list with hermes pairing list.
Execution backend. Use terminal.backend: docker for production gateways. Set CPU, memory, and disk limits appropriate for your workload. The defaults (1 CPU, 5GB RAM, 50GB disk) are reasonable starting points, but memory-intensive tasks like browser automation may need more headroom.
Credential hygiene. Store API keys in ~/.hermes/.env and set chmod 600 on the file. Never commit .env files to version control. Use per-server MCP environment configuration instead of global environment variables. Review the command_allowlist in your config periodically to make sure you haven't permanently approved something you shouldn't have.
Working directory. Set the MESSAGING_CWD environment variable to a dedicated directory. This prevents the agent from operating in sensitive locations like your home directory or a production checkout.
Process user. Run the gateway process as a non-root user. The official Docker image already defaults to UID 10000, but if you are using the local or SSH backend, create a dedicated hermes user with limited filesystem access.
External storage. Container filesystems are ephemeral by design. For agent outputs that need to survive container rebuilds, send them to an external workspace. This applies to generated reports, processed files, and anything a human needs to review later. S3, Google Cloud Storage, or a workspace platform like Fast.io all work. Fast.io adds the advantage of built-in file versioning, audit trails, and ownership transfer so you can hand off agent-created workspaces to clients or team members.
Updates. Run hermes update regularly. Security patches for the approval patterns, Tirith scanner, and SSRF blocklists ship through the standard update channel.
Monitoring. Check ~/.hermes/logs/ for unauthorized access attempts, repeated command denials, and pairing code abuse. Set up external log shipping if your team has a centralized logging platform.
Frequently Asked Questions
Is Hermes Agent safe to run?
Hermes Agent includes seven security layers by default: command approval, user authorization, container isolation, MCP credential filtering, context file scanning, cross-session isolation, and input sanitization. For development, the local backend with manual command approval provides reasonable safety. For production, switching to the Docker backend adds OS-level isolation that contains any destructive commands to the container filesystem.
How does Hermes Agent handle security?
Hermes applies defense-in-depth. Before any tool call executes, the system checks user authorization, scans the command against dangerous patterns, optionally runs it through the Tirith content scanner, and then executes it inside the configured backend (local, Docker, SSH, or cloud sandbox). Credentials are filtered from MCP subprocesses and redacted from error messages. Project context files are scanned for prompt injection before loading.
Can Hermes Agent run in a sandbox?
Yes. Hermes supports five sandboxed backends: Docker, Singularity, Modal, Daytona, and Vercel Sandbox. Docker is the most common choice for production. Containers run with all Linux capabilities dropped, no privilege escalation allowed, PID limits to prevent fork bombs, and separate noexec tmpfs mounts. When using any container backend, dangerous command approval checks are skipped because the container boundary provides the isolation.
How do I secure Hermes Agent in production?
Use the Docker backend with explicit user allowlists, never enable allow-all flags, store API keys in a chmod 600 .env file, set MESSAGING_CWD to a dedicated working directory, run as non-root, configure appropriate resource limits, and enable DM pairing instead of hardcoded user IDs. For additional isolation, use the SSH backend to separate the messaging gateway from the execution environment.
What is YOLO mode in Hermes Agent?
YOLO mode disables all dangerous command approval checks for the current session. You can activate it with the --yolo CLI flag, the /yolo slash command, or the HERMES_YOLO_MODE=1 environment variable. Even in YOLO mode, the hardline blocklist remains active, preventing operations like rm -rf / and fork bombs. YOLO mode is recommended only in trusted environments running well-tested automation scripts.
Does Hermes Agent protect against prompt injection?
Hermes scans project context files (AGENTS.md, .cursorrules, SOUL.md) for prompt injection patterns before loading them into the system prompt. It checks for instructions to ignore prior instructions, hidden HTML comments, attempts to read secrets, credential exfiltration commands, and invisible Unicode characters. Blocked files produce a warning instead of loading, and the Tirith scanner adds a second layer of pre-execution content analysis.
Related Resources
Persist Hermes Agent outputs across container rebuilds
Free 50GB workspace with versioned storage, audit trails, and MCP access. No credit card, no trial expiration.