hermes-router

Keep your AI app online for free. One OpenAI/Anthropic-compatible endpoint in front of a pool of free providers — with automatic key rotation, failover, and smart routing.

Why hermes-router?

Free AI tiers are generous but rate-limited. hermes-router sits between your app and a pool of free providers (Gemini, OpenRouter, Groq, and more). When one provider hits its limit, the router automatically falls back to the next, so your app keeps working instead of erroring out. Point any OpenAI or Anthropic client at the router and nothing else in your code changes.

Never hit a rate limit

Automatic key rotation and provider failover across 16 providers keep you online.

Drop-in compatible

Speaks the OpenAI and Anthropic APIs. Tool calling, embeddings, and streaming all work.

Smart, cheap routing

Each request goes to the cheapest model that can handle it; unhealthy providers are skipped.
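One way to picture "cheapest model that can handle it" is as a filter followed by a minimum. The sketch below is an assumption about the general idea, not the router's real selection logic: the `Model` fields, the capability labels, and the prices are all made up for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_mtok: float          # hypothetical price per million tokens
    capabilities: FrozenSet[str]  # hypothetical labels such as "chat" or "tools"
    healthy: bool = True


def pick_model(models: Iterable[Model], needed: Iterable[str]) -> Optional[Model]:
    """Return the cheapest healthy model that has every needed capability."""
    needed_set = frozenset(needed)
    candidates = [m for m in models if m.healthy and needed_set <= m.capabilities]
    return min(candidates, key=lambda m: m.cost_per_mtok, default=None)


catalog = [
    Model("small-free", 0.0, frozenset({"chat"})),
    Model("tools-free", 0.0, frozenset({"chat", "tools"}), healthy=False),
    Model("tools-paid", 0.5, frozenset({"chat", "tools"})),
]

# "tools-free" would be cheapest, but it is marked unhealthy and is skipped.
choice = pick_model(catalog, {"chat", "tools"})
print(choice.name if choice else "no model available")
```

Returning `None` when nothing qualifies leaves it to the caller to decide whether to fail or to relax the requirements.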

Yours to run

A single self-hosted Python file. Your keys live in your own auth.json — nothing hidden.

Start here

Getting Started: Zero-experience intro and your first message.

How it works (Architecture): The request pipeline and every moving part.

Build an Agent: Chatbot → memory → tools, copy-paste.

Providers: Free & paid providers, sign-up links, Codex.

Deployment: Run it locally, with the Docker Hub image, or on a free Hugging Face Space.

VS Code Extension: Monitor & manage the router, and use it as a model in Copilot Chat.

Quick start

# Shell: download and run the install script
curl -fsSL https://raw.githubusercontent.com/Shaf2665/Hermes-router/main/get.sh | bash
hr setup        # add a key, start the router

# Python: send a first message through the running router
from openai import OpenAI

# Point the standard OpenAI client at the local router endpoint.
client = OpenAI(base_url="http://localhost:8319/v1", api_key="sk-router-1")
resp = client.chat.completions.create(
    model="hermes-router",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
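Because the router speaks the OpenAI API, including streaming, the same client should also be able to stream a reply. The helper below is a minimal sketch: `stream_reply` is a hypothetical name, and it assumes the router returns standard OpenAI-style chunks with `choices[0].delta.content`.

```python
from typing import Any, Iterator


def stream_reply(client: Any, prompt: str, model: str = "hermes-router") -> Iterator[str]:
    """Yield text fragments as they arrive from an OpenAI-compatible client."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # ask for incremental chunks instead of one final response
    )
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks carry no choices, e.g. usage-only chunks
        text = chunk.choices[0].delta.content
        if text:
            yield text


# Usage with the quick-start client (requires the router to be running):
#   for piece in stream_reply(client, "Hello!"):
#       print(piece, end="", flush=True)
```

Passing the client in as an argument keeps the helper independent of how the client was built, so it works with any object that exposes the same `chat.completions.create` call.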

On Windows or hosting in the cloud? See Deployment.