AI & Agents

How to Deploy Hermes Agent in Production With Docker

Hermes Agent runs as a persistent background service inside Docker, exposing an OpenAI-compatible API on port 8642 and an optional web dashboard on port 9119. This guide walks through production-ready Docker Compose configuration, volume management, multi-profile container isolation, resource limits, and connecting containers to external persistent storage for long-running deployments.

Fast.io Editorial Team 9 min read
AI agent workspace dashboard showing shared files and collaboration tools

What Hermes Agent Gateway Mode Does

Nous Research Hermes Agent is an open-source (MIT-licensed) AI agent that supports persistent memory, skills, scheduled automations, and messaging platform integrations. When you run it in gateway mode, the agent starts as a background service that exposes an OpenAI-compatible HTTP API on port 8642. Any client that speaks the OpenAI chat completions format can connect to it, including other agents, custom scripts, and LLM orchestration frameworks.

Gateway mode is the recommended approach for production. Instead of running an interactive CLI session that dies when you close your terminal, the gateway keeps the agent alive across restarts and accepts requests over HTTP. You can pair it with a web dashboard on port 9119 for monitoring sessions, reviewing memory, and chatting through the browser.

The Docker image bundles everything Hermes needs: Python 3, Node.js, Playwright with Chromium, ripgrep, ffmpeg, git, and tini as the init process. The base image is Debian 13.4, and the container runs as a non-root user (UID 10000) by default.

A Minimal Production Docker Setup

A minimal production setup needs three things: the gateway command, a persistent volume for state, and resource limits to prevent runaway memory usage. Here is a complete docker-compose.yml that covers the essentials:

services:
  hermes:
    image: nousresearch/hermes-agent:latest
    container_name: hermes
    restart: unless-stopped
    command: gateway run
    ports:
      - "127.0.0.1:8642:8642"
      - "127.0.0.1:9119:9119"
    volumes:
      - ./hermes-data:/opt/data
    environment:
      - HERMES_DASHBOARD=1
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: "2.0"

A few things to notice. The ports bind to 127.0.0.1 instead of 0.0.0.0, which prevents direct internet exposure. Put a reverse proxy like Caddy or nginx in front to handle TLS and authentication. The restart: unless-stopped policy means Docker will restart the container after crashes or host reboots, but not if you explicitly stop it.

The HERMES_DASHBOARD=1 environment variable enables the web dashboard on port 9119. If you do not need the dashboard, remove that line and the 9119 port mapping. You can also customize the dashboard with HERMES_DASHBOARD_HOST, HERMES_DASHBOARD_PORT, and HERMES_DASHBOARD_TUI (set to 1 for in-browser chat).

Start the service:

docker compose up -d

Check that it is running:

docker logs hermes
docker stats hermes
Dashboard showing AI agent activity and audit trail

Persistent Volumes and State Management

Everything Hermes Agent needs to remember lives in a single directory: /opt/data inside the container. This is the directory you mount as a volume. It contains:

  • .env and config.yaml for configuration
  • SOUL.md for the agent's personality and instructions
  • sessions/ for conversation history
  • memories/ for long-term memory
  • skills/ for installed skills
  • cron/ for scheduled automations
  • hooks/ for event-driven actions
  • logs/ for runtime logs

Because the Docker image itself is stateless, upgrading is straightforward: pull the new image and recreate the container. Your data persists in the mounted volume.

docker compose pull
docker compose up -d

One critical rule: never run two gateway containers against the same data directory. Concurrent writes to the same sessions, memories, and config files will corrupt state. If you need multiple instances, use the multi-profile pattern described below.

For the initial setup, run the setup wizard before starting gateway mode:

docker run -it --rm \
  -v ./hermes-data:/opt/data \
  nousresearch/hermes-agent setup

This creates the .env file where you configure your LLM API keys, messaging platform tokens, and other settings. You only need to run this once per data directory.

Fastio features

Give Your Hermes Agent a Persistent Workspace

Fast.io gives your Docker-deployed Hermes Agent 50 GB of free cloud storage with built-in RAG, file versioning, and human handoff. No credit card required.

Multi-Profile Container Isolation

Running multiple Hermes Agent instances, for example separate work and personal profiles, or isolated agents for different clients, is better handled with separate containers rather than Hermes' built-in profile switching. Each container gets its own data directory, port mapping, and lifecycle.

services:
  hermes-work:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-work
    restart: unless-stopped
    command: gateway run
    ports:
      - "127.0.0.1:8642:8642"
      - "127.0.0.1:9119:9119"
    volumes:
      - ./hermes-work:/opt/data
    environment:
      - HERMES_DASHBOARD=1
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: "2.0"

hermes-personal:
    image: nousresearch/hermes-agent:latest
    container_name: hermes-personal
    restart: unless-stopped
    command: gateway run
    ports:
      - "127.0.0.1:8643:8642"
      - "127.0.0.1:9120:9119"
    volumes:
      - ./hermes-personal:/opt/data
    environment:
      - HERMES_DASHBOARD=1
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: "2.0"

This pattern gives you complete isolation. Each container has its own memories, skills, sessions, and scheduled automations. You can stop and upgrade one without affecting the other. Different containers can even use different LLM providers, since each has its own .env configuration.

The port mapping shifts by one for the second container: 8643 for the gateway API and 9120 for the dashboard. Your reverse proxy routes to the correct backend based on hostname or path prefix.

Hierarchical organization structure showing isolated workspaces and permissions

Resource Limits and Browser Automation

Hermes Agent's resource requirements depend on what you ask it to do. The official recommendations from Nous Research:

  • Minimum: 1 GB RAM, 1 CPU core, 500 MB disk
  • Recommended: 2 to 4 GB RAM, 2+ CPU cores, 2+ GB disk

Browser automation through Playwright and Chromium is the most memory-hungry feature. If your agent uses browser tools (web scraping, form filling, screenshot capture), you need to increase shared memory:

docker run -d \
  --name hermes \
  --shm-size=1g \
  --memory=4g --cpus=2 \
  -v ./hermes-data:/opt/data \
  -p 127.0.0.1:8642:8642 \
  nousresearch/hermes-agent gateway run

The --shm-size=1g flag is critical. Without it, Chromium will crash with "out of memory" errors because Docker's default shared memory allocation (64 MB) is too small for browser rendering. In Docker Compose, add it under the service definition:

services:
  hermes:
    image: nousresearch/hermes-agent:latest
    shm_size: "1g"
    ### ... rest of config

If your agent does not use browser tools, you can run comfortably with 2 GB RAM. Monitor actual usage with docker stats hermes during typical workloads and adjust limits accordingly.

Connecting to External Persistent Storage

The /opt/data volume handles Hermes Agent's internal state, but production agents often need to read and write files that outlive any single session: reports, generated documents, research archives, or files shared with human teammates. Local Docker volumes work for single-server setups, but they do not help when you need to share files across multiple agent containers, back up outputs to a durable location, or hand files off to humans.

For local-only deployments, you can mount additional host directories into the container. But for team workflows where agents produce files that humans need to review, approve, and distribute, a cloud workspace gives you durability and collaboration without managing your own object storage.

Fast.io works well here because agents can interact with it through the MCP server or REST API. An agent running in Docker uploads files to a Fast.io workspace, and a human teammate reviews them through the web UI. Intelligence Mode auto-indexes uploaded files for semantic search and citation-backed chat, so you can ask questions about agent outputs without downloading them.

The free agent plan includes 50 GB of storage, 5,000 monthly credits, and 5 workspaces, with no credit card or trial expiration. When the agent's work is done, ownership transfer lets you hand the entire workspace to a human while the agent retains admin access.

Other options for external storage include mounting an S3-compatible bucket with s3fs-fuse, using an NFS share for multi-host setups, or pointing agents at Google Drive or Dropbox APIs. Each has tradeoffs around latency, access control, and how easily humans can consume the files. Fast.io's advantage is that the workspace is already designed for mixed human and agent access, with built-in file previews, versioning, audit trails, and granular permissions.

Troubleshooting Common Issues

Container exits immediately after starting. Check the logs with docker logs hermes. The most common cause is a missing or malformed .env file in the data volume. Run the setup wizard (docker run -it --rm -v ./hermes-data:/opt/data nousresearch/hermes-agent setup) to create one.

Permission denied errors. The container runs as a non-root user with UID 10000. If your host volume was created by a different user, fix ownership:

sudo chown -R 10000:10000 ./hermes-data

You can also set custom UID/GID with the HERMES_UID and HERMES_GID environment variables.

Port conflicts. If port 8642 is already in use, either stop the conflicting process or map to a different host port: -p 127.0.0.1:8700:8642.

Browser tools crash with OOM. Add --shm-size=1g to your docker run command or shm_size: "1g" in your Compose file. Chromium needs more shared memory than Docker's 64 MB default.

Gateway stops responding. Restart the container with docker restart hermes. If the problem persists, check disk space on the host and review docker stats hermes for memory pressure. Gateway reconnection issues sometimes require a full container recreation: docker compose down && docker compose up -d.

Upgrading without data loss. Your data lives in the mounted volume, not in the container. Pull the new image and recreate:

docker compose pull
docker compose up -d

The old container is replaced, but /opt/data on the host stays intact. All sessions, memories, skills, and cron jobs carry over.

Frequently Asked Questions

How do I run Hermes Agent in Docker?

Pull the official image with docker pull nousresearch/hermes-agent:latest, run the setup wizard to create your .env configuration, then start the gateway with docker run -d --restart unless-stopped -v ./hermes-data:/opt/data -p 8642:8642 nousresearch/hermes-agent gateway run.

What ports does Hermes Agent use?

Port 8642 exposes the gateway's OpenAI-compatible API and health endpoint. Port 9119 serves the optional web dashboard when HERMES_DASHBOARD=1 is set. Both ports are configurable through environment variables.

How do I deploy Hermes Agent as a production service?

Use Docker Compose with gateway mode, a persistent volume mounted at /opt/data, resource limits (2 to 4 GB RAM, 2+ CPUs), restart unless-stopped policy, and a reverse proxy like Caddy or nginx for TLS. Bind ports to 127.0.0.1 to prevent direct internet exposure.

Can I run multiple Hermes Agent instances in Docker?

Yes. Use separate containers with distinct data volumes and port mappings. For example, one container maps port 8642 and another maps port 8643, each with its own /opt/data volume. Never share a data directory between two running gateway containers.

How much RAM does Hermes Agent need in Docker?

The minimum is 1 GB RAM, but Nous Research recommends 2 to 4 GB for production use. If the agent uses browser automation through Playwright, allocate at least 4 GB RAM and add --shm-size=1g for shared memory.

How do I update Hermes Agent Docker containers?

Run docker compose pull to fetch the latest image, then docker compose up -d to recreate the container. Your data persists in the mounted volume, so sessions, memories, skills, and cron jobs carry over automatically.

Related Resources

Fastio features

Give Your Hermes Agent a Persistent Workspace

Fast.io gives your Docker-deployed Hermes Agent 50 GB of free cloud storage with built-in RAG, file versioning, and human handoff. No credit card required.