Never hit a rate limit
Automatic key rotation and provider failover across 16 providers keep you online.
Free AI tiers are generous but rate-limited. hermes-router sits between your app and a pool of free providers (Gemini, OpenRouter, Groq, and more). When one provider hits its limit, the router automatically falls back to the next, so your app keeps working instead of erroring out. Point any OpenAI or Anthropic client at it and nothing else changes.
Drop-in compatible
Speaks the OpenAI and Anthropic APIs. Tool calling, embeddings, and streaming all work.
Smart, cheap routing
Each request goes to the cheapest model that can handle it; unhealthy providers are skipped.
Yours to run
A single self-hosted Python file. Your keys live in your own auth.json — nothing hidden.
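As a rough illustration of the "Smart, cheap routing" idea, the sketch below filters out unhealthy or incapable models and then picks the cheapest remaining one. The `Model` fields, the `pick_model` helper, and the catalog entries and prices are all made up for the example; the router's real selection criteria may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    provider: str
    name: str
    cost_per_mtok: float   # illustrative price per million tokens
    context_window: int    # maximum tokens the model accepts
    supports_tools: bool
    healthy: bool = True


def pick_model(models: list[Model], needed_tokens: int,
               needs_tools: bool = False) -> Optional[Model]:
    """Return the cheapest healthy model that can handle the request."""
    candidates = [
        m for m in models
        if m.healthy                                  # skip unhealthy providers
        and m.context_window >= needed_tokens         # must fit the request
        and (m.supports_tools or not needs_tools)     # tool calling if required
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.cost_per_mtok)


# Hypothetical catalog; names and prices are placeholders.
catalog = [
    Model("provider-a", "small", 0.0, 8_000, False),
    Model("provider-b", "medium", 0.2, 32_000, True),
    Model("provider-c", "large", 1.0, 128_000, True),
    Model("provider-d", "cheap-but-down", 0.0, 128_000, True, healthy=False),
]

simple = pick_model(catalog, needed_tokens=4_000)
with_tools = pick_model(catalog, needed_tokens=4_000, needs_tools=True)
long_input = pick_model(catalog, needed_tokens=100_000)
print(simple.name, with_tools.name, long_input.name)
```

Filtering before taking the minimum keeps the two concerns separate: capability and health decide who is eligible, and price only breaks the tie among eligible models.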
    curl -fsSL https://raw.githubusercontent.com/Shaf2665/Hermes-router/main/get.sh | bash
    hr setup   # add a key, start the router

    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8319/v1", api_key="sk-router-1")
    resp = client.chat.completions.create(
        model="hermes-router",
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(resp.choices[0].message.content)

On Windows or hosting in the cloud? See Deployment.