This guide shows you exactly how to run hermes-router, step by step, no prior server
experience needed. Pick the path that matches your computer and follow it top to bottom.
hermes-router has two pieces. Knowing this makes everything below make sense:
The router (router.py) — the actual program. It’s plain Python, so it runs on
Windows, macOS, and Linux equally well. This is the part that does the work.
The hr command — a friendly helper (hr setup, hr auth add, hr status, …). It’s
written in bash, so it works on Linux/macOS (and Windows via WSL2 or Git Bash), but
not in a plain Windows Command Prompt / PowerShell.
Takeaway: the engine runs everywhere. If you’re on Windows and don’t want bash, you either
use Docker or run the Python program directly — both are covered below.
Add more -e <PROVIDER>_API_KEYS=… for other providers (see providers.md).
Then curl http://localhost:8319/health. To persist keys/state instead of passing env vars,
mount a volume with an auth.json (and set -e ROUTER_STATE_FILE=/tmp/router_state.json).
Windows (PowerShell / Command Prompt): put it on one line. The \ line-continuation
above is Linux/macOS shell syntax — on Windows it errors with docker: invalid reference format and '-e' is not recognized. Use a single line instead:
(PowerShell’s line-continuation character is a backtick `, not \, if you want to split
it across lines.) PROXY_API_KEYS=sk-router-1 here matches the VS Code extension’s default
apiKey, so the extension connects with no extra config — pick your own secret only if you
also set hermesRouter.apiKey to the same value.
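If you’d rather not think about line-continuation characters at all, one option is to build the command as a list of arguments in Python and run it without a shell. This is only a sketch: docker_run_args is a hypothetical helper, and the image name shafiq735/hermes-router is an assumption (the untagged form of the :cli image named in this guide), so adjust it if yours differs.

```python
# Assumption: the plain image is the untagged form of the ":cli" image.
IMAGE = "shafiq735/hermes-router"


def docker_run_args(env, name="hermes-router", port=8319, image=IMAGE):
    """Build the argument list for `docker run`, one -e flag per variable."""
    args = ["docker", "run", "-d", "--name", name, "-p", f"{port}:{port}"]
    for key, value in env.items():
        args += ["-e", f"{key}={value}"]
    args.append(image)
    return args


# Example (left commented out because it starts a container):
#   import subprocess
#   subprocess.run(docker_run_args({
#       "GEMINI_API_KEYS": "paste-your-gemini-key",
#       "PROXY_API_KEYS": "sk-router-1",
#   }), check=True)
```

Because the arguments are passed as a list, no shell parses them, so the same script should behave the same in PowerShell, Command Prompt, and bash.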
In Docker Desktop → Containers, check that the row shows Port(s) = 8319:8319. If
that column is blank, the container has no published port (the -p was dropped) and
nothing on your PC can reach it — remove it and re-run the command above (you can’t add a port
to an existing container):
The VS Code extension talks to the router over HTTP, so it works against
your Docker container. In the extension settings:
hermesRouter.baseUrl → http://localhost:8319
hermesRouter.apiKey → the same value as the container’s PROXY_API_KEYS (sk-router-1
in the command above).
A red 401 unauthorized error in the dashboard (it tells you to check hermesRouter.apiKey) means
those two values don’t match. Set apiKey to the same value as PROXY_API_KEYS and click Refresh.
Want the Add key / Restart / Model buttons to work too? The plain image has no hr inside it,
so use the :cli image variant with a volume, and tell the extension the container name:
docker run -d --name hermes-router -p 8319:8319 -v hermes-data:/app/data -e PROXY_API_KEYS=sk-router-1 shafiq735/hermes-router:cli
Then set hermesRouter.dockerContainer to hermes-router. Now the buttons run via
docker exec/docker restart against the container, and the /app/data volume keeps your keys
and settings across restarts. Full details: VS Code Extension → Docker.
(Sticking with the plain image? Add providers by re-running with another -e <PROVIDER>_API_KEYS=…
and “restart” with docker restart hermes-router.)
Step 2 — create an isolated environment and install dependencies:
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
You’ll see (venv) at the start of your prompt — that means it’s active.
Step 3 — set your keys (for this terminal session):
$env:GEMINI_API_KEYS="paste-your-gemini-key"
$env:PROXY_API_KEYS="choose-a-secret-password"
Step 4 — start the router:
python router.py
Leave this window open — it’s now running on http://localhost:8319. Open a second
PowerShell window to test it (see Check it’s working).
Making keys stick: the $env: lines only last for that window. To set them
permanently, use Windows “Edit the system environment variables”, or create a .env file
in the folder (the router reads it on startup):
GEMINI_API_KEYS=paste-your-gemini-key
PROXY_API_KEYS=choose-a-secret-password
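Before starting the router, you may want to sanity-check the .env file. The sketch below is not the router’s own parser (this guide doesn’t show how the router reads the file); parse_env_text and env_warnings are hypothetical helpers that assume simple KEY=VALUE lines like the two above.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def env_warnings(values):
    """Warn when the proxy key or every provider key is missing."""
    warnings = []
    if not values.get("PROXY_API_KEYS"):
        warnings.append("PROXY_API_KEYS is not set")
    providers = [
        key for key, value in values.items()
        if key.endswith("_API_KEYS") and key != "PROXY_API_KEYS" and value
    ]
    if not providers:
        warnings.append("no <PROVIDER>_API_KEYS entry is set")
    return warnings
```

To use it, read the file yourself (for example with pathlib.Path(".env").read_text()) and pass the text to parse_env_text.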
You can also manage keys by editing auth.json directly:
Want the router running in the cloud (free) instead of your own computer? A Docker Space
works well. Follow these steps carefully — the port is the #1 thing people get wrong.
Step 1 — create the Space. On huggingface.co → your profile →
New Space, and fill the form like this:
Field on the “Create a new Space” screen → Choose
Owner / Space name → your account / e.g. hermes-router (this becomes your URL)
Short description → optional, e.g. “Free-tier AI load balancer”
License → mit (matches the project)
Select the Space SDK → Docker (⚠️ not Gradio/Static; it’s a web server, not a Gradio app)
Choose a Docker template → Blank (⚠️ not Streamlit/Shiny/etc.; we ship our own Dockerfile, so you want the empty starting point)
Space hardware → CPU Basic (Free); the router is lightweight, no GPU needed
Storage Bucket → leave off (keys go in Secrets, not files; see Step 4)
Space Dev Mode → leave off (PRO-only, not needed)
Visibility → Public; the app URL must be reachable, which is why Step 5 (a strong proxy key) matters
Why Public? A Public Space’s app URL is openly reachable, which is what lets your app
connect to it. Private Spaces require an HF token on every request (awkward for an agent),
so Public + a strong PROXY_API_KEYS is the practical combo. Your secrets are not in the
public repo (they live in Settings → Secrets), so they stay private either way.
Then click Create Space.
Step 2 — add hermes-router’s files (replacing HF’s placeholder). When the Space is
created, Hugging Face shows a “Get started with your Docker Space!” page with an example
FastAPI app on port 7860. Ignore that example — you’re bringing hermes-router
instead. You need exactly three files, all from the
GitHub repo: router.py, requirements.txt,
and Dockerfile.
Pick whichever way you’re comfortable with:
Easiest — upload in the browser (no git, no SSH key needed):
Download the three files from the GitHub repo (open each file → the download/raw
button).
On your Space page, open the Files tab → Add file → Upload files.
Drop in router.py, requirements.txt, and Dockerfile.
Write a short commit message and click Commit changes to main.
Or with git (clone the Space, add the files, push):
git clone https://huggingface.co/spaces/<your-username>/<space-name>
cd <space-name>
# copy router.py, requirements.txt and Dockerfile into this folder, then:
git add router.py requirements.txt Dockerfile
git commit -m "Add hermes-router"
git push
When git asks for a password, paste a write access token from
huggingface.co/settings/tokens (not your account
password). SSH works too, but only after you add an SSH key under
Settings → SSH and GPG Keys — the token route above avoids that.
Both methods keep the Space’s own README.md (the one HF created), which you’ll edit in the
next step. Don’t overwrite it with the GitHub repo’s README — that one has no Space config.
Step 3 — fix the port (don’t skip this!). This is the critical bit. Hugging Face serves
your app on one port, set by app_port (default 7860), but the router listens on
8319. The README that HF auto-generates has no app_port line at all — so you must
add it, or the Space hangs on “Starting” forever. Edit the Space’s own README.md and
make the YAML block at the very top look like this (the new line is app_port: 8319):
---
title: Hermes Router
emoji: 🔀
colorFrom: indigo
colorTo: blue
sdk: docker
app_port: 8319
pinned: false
---
The sdk: docker and app_port: 8319 lines are the ones that matter; the rest is cosmetic.
(Alternatively, leave app_port alone and add a Space variable PORT=7860 so the router
listens where HF expects. Either works — just don’t skip this step, or the page will show
“connection refused” / nothing.)
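If you want to double-check the README before committing, a small script can read the metadata block and confirm the two lines that matter. This is a minimal sketch using plain string handling rather than a YAML library; front_matter and port_problems are hypothetical names, and the check covers only the app_port route, not the PORT=7860 alternative.

```python
def front_matter(readme_text):
    """Return the key: value pairs between the leading '---' lines."""
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def port_problems(readme_text, router_port=8319):
    """List what is wrong with the sdk and app_port lines, if anything."""
    fields = front_matter(readme_text)
    problems = []
    if fields.get("sdk") != "docker":
        problems.append("sdk is not docker")
    if fields.get("app_port") != str(router_port):
        problems.append(f"app_port is not {router_port}")
    return problems
```

An empty list means the block matches the one shown above; a missing app_port line is reported as “app_port is not 8319”.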
Step 4 — add your keys as Secrets. A Space’s storage is wiped on every rebuild, so
don’t rely on auth.json. Go to Settings → Variables and secrets and add (as
Secrets):
GEMINI_API_KEYS = your-gemini-key
PROXY_API_KEYS = a-strong-secret-you-choose
ROUTER_STATE_FILE = /tmp/router_state.json
Add more provider keys (OPENROUTER_API_KEYS, etc.) the same way. ROUTER_STATE_FILE points
the ratings cache at /tmp, which is writable on a Space.
Step 5 — 🔒 lock it down. Your Space URL is public — anyone who finds it can spend
your quota. So PROXY_API_KEYS must be a strong secret you choose, never the default
sk-router-1.
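One way to get a strong secret is to let Python generate it. A minimal sketch using the standard secrets module follows; new_proxy_key and is_weak are hypothetical helpers, and the 24-character minimum is an arbitrary assumption rather than a requirement of the router.

```python
import secrets

# The default key this guide says never to use on a public Space.
DEFAULT_KEY = "sk-router-1"


def new_proxy_key(nbytes=32):
    """Return a random URL-safe string to use as PROXY_API_KEYS."""
    return secrets.token_urlsafe(nbytes)


def is_weak(key, min_length=24):
    """Flag the default key and anything shorter than min_length."""
    return key == DEFAULT_KEY or len(key) < min_length
```

Run print(new_proxy_key()) once, paste the result into the PROXY_API_KEYS secret, and use the same value in your app.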
Step 6 — wait for it to build, then open your Space URL:
https://<your-username>-<space-name>.hf.space/health — you should see
{"status":"ok",...}.
Heads-up: free Spaces go to sleep when idle. The first request after a nap takes a
while to wake the Space and may time out — just retry, or have your app retry automatically.
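The retry advice can be sketched with the standard library alone. Everything here is illustrative: space_health_url simply follows the URL pattern from Step 6, while the attempt count, delay, and timeout are assumptions you should tune for your app.

```python
import time
import urllib.error
import urllib.request


def space_health_url(username, space_name):
    """Build the /health URL using the pattern shown in Step 6."""
    return f"https://{username}-{space_name}.hf.space/health"


def fetch_status(url, timeout=30):
    """Return the HTTP status code of a GET request to url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code


def wait_until_awake(url, attempts=5, delay=10.0,
                     fetch=fetch_status, sleep=time.sleep):
    """Retry until fetch(url) returns 200; timeouts count as misses."""
    for attempt in range(attempts):
        try:
            if fetch(url) == 200:
                return True
        except OSError:
            pass  # URLError and socket timeouts are OSError subclasses
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

The fetch and sleep parameters exist so the loop can be exercised without a network; in normal use you would call wait_until_awake(space_health_url("your-username", "hermes-router")).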
Whichever path you took, verify with these three quick checks.
1. Is it alive?
curl http://localhost:8319/health
Expected: {"status":"ok",...}. (No curl? Paste http://localhost:8319/health into a web
browser.)
2. Can it answer a real question? Replace sk-router-1 with your PROXY_API_KEYS value:
curl http://localhost:8319/v1/chat/completions \
  -H "Authorization: Bearer sk-router-1" \
  -H "Content-Type: application/json" \
  -d '{"model":"hermes-router","messages":[{"role":"user","content":"Say hi in one word"}]}'
Expected: a JSON reply with the model’s answer inside choices[0].message.content.
3. Point your app at it. Use base URL http://localhost:8319/v1, API key = your
PROXY_API_KEYS, model hermes-router. See usage.md for full examples
(OpenAI SDK, Anthropic SDK, etc.).
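If you’d like to try this from Python before installing any SDK, the curl call in check 2 translates to the standard library roughly as follows. This is a sketch, not the project’s client: build_chat_request and ask are hypothetical names, and the reply shape is assumed to match check 2 (choices[0].message.content).

```python
import json
import urllib.request


def build_chat_request(prompt, api_key, base_url="http://localhost:8319/v1"):
    """Build the same POST that the curl check sends."""
    body = {
        "model": "hermes-router",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(prompt, api_key, base_url="http://localhost:8319/v1", timeout=60):
    """Send the request and return choices[0].message.content."""
    request = build_chat_request(prompt, api_key, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

With the router running, ask("Say hi in one word", "sk-router-1") should return the model’s answer as a string; a mismatched key raises urllib.error.HTTPError with code 401.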
A Hugging Face Space stuck on “Starting” is the #1 issue, and the cause is the port. Open the Logs → Container tab: if you see
Serving on http://0.0.0.0:8319, the router is fine — HF just isn’t looking there. The
auto-generated README.md does not include app_port, so HF probes its default 7860
and never sees the app. Fix: add app_port: 8319 to the README metadata block (Step 3) and
commit. It’ll rebuild and flip to “Running”.
Connection refused / page won’t load
Is the router actually running? (Docker: docker compose ps; native: is the python router.py window still open?)
On a Hugging Face Space, this is almost always the port — re-check Step 3 above
(app_port: 8319 or PORT=7860).
Logs say No providers configured (Hugging Face Space)
You haven’t added any keys yet. Go to Settings → Variables and secrets and add your
provider keys + PROXY_API_KEYS as Secrets (Step 4). The only Space setting you need to
touch is “Variables and secrets” — leave hardware, sleep, visibility, etc. at their
defaults.
401 Unauthorized
Your app’s API key doesn’t match PROXY_API_KEYS. Make them the same and try again.
All providers exhausted (503)
No keys loaded, or all of them are rate-limited. Confirm a key is set (hr auth list, or
check your .env/secrets), and add more — see providers.md.
hr: command not found
The hr helper is Linux/macOS/WSL only. On native Windows use Docker or python router.py (Path 3). On Linux/macOS, re-run ./install.sh or open a new terminal so PATH
refreshes.
Port 8319 already in use
Something else is using it. Set a different port: PORT=8320 in .env (and point your app
at the new port), then restart.
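To see whether something is already listening before you pick a new port, a quick probe with Python’s socket module may help. This is a sketch with hypothetical helper names; it checks only the local machine (127.0.0.1) and only whether a TCP connection is accepted.

```python
import socket


def port_in_use(port, host="127.0.0.1", timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        return probe.connect_ex((host, port)) == 0


def first_free_port(start=8319, tries=20):
    """Return the first port at or above start with no listener, or None."""
    for port in range(start, start + tries):
        if not port_in_use(port):
            return port
    return None
```

Whatever first_free_port() returns is a candidate for the PORT value in .env.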
Windows: python isn’t recognized
Python isn’t on your PATH. Reinstall from python.org and tick “Add python.exe to PATH”,
or use the Docker path instead.
Still stuck? Run hr doctor (Linux/macOS/WSL) for an automated diagnosis, or check the logs
(docker compose logs -f, or router.log).
On Linux, hr setup can install a systemd service so the router starts on boot and
restarts automatically if it crashes; after that, hr restart manages it. If systemd isn’t
available, hr restart falls back to a background process automatically. On Docker, use
restart: unless-stopped (already set in docker-compose.yml) for the same effect.