Getting started

You need two things: the base URL of a running Hermes Web UI server, and (if the instance has a password) that password. Everything below is plain HTTP — use curl, fetch, URLSession, OkHttp, or anything that speaks HTTP/1.1.

1. Find the server

A Hermes Web UI server listens on a host you control — e.g. http://127.0.0.1:8787 locally, or https://hermes.example.com behind a reverse proxy. That origin is your API base URL. Every path in this reference is relative to it.
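Since every path is relative to the origin, a client can resolve paths against it in one place. The following is a minimal sketch, not part of the API itself; the helper name `apiUrl` is hypothetical.

```javascript
// Hypothetical helper: resolve an API path against the server origin.
// Assumes `base` is a bare origin such as "http://127.0.0.1:8787".
function apiUrl(base, path) {
  // The URL constructor handles a trailing slash on the base for us.
  return new URL(path, base).toString();
}
```

For example, `apiUrl("https://hermes.example.com/", "/api/sessions")` resolves to `https://hermes.example.com/api/sessions`. If the server were mounted under a path prefix rather than at the origin, an absolute path like this would drop the prefix, so the sketch only fits the origin-as-base setup described above.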

2. Check whether auth is required

curl -s https://your-host/api/auth/status
{ "authenticated": true, "password_required": false }
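  • password_required: false → the instance is open; you can call the API directly.
  • password_required: true and authenticated: false → log in first (next step).
  • password_required: true and authenticated: true → you already hold a valid session; skip the login step.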

/api/auth/status is public — you can always call it to probe an instance.
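The decision above can be captured in a small predicate over the status response. This is a sketch that relies only on the two fields shown in the sample response; the function name `needsLogin` is hypothetical.

```javascript
// Hypothetical helper: decide whether a login call is required,
// given the parsed JSON body of /api/auth/status.
function needsLogin(status) {
  // Open instance: no password, so nothing to do.
  if (status.password_required !== true) return false;
  // Password-protected: log in unless this session is already authenticated.
  return status.authenticated !== true;
}
```

A client would typically call this once at startup and only then decide whether to prompt for the password.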

3. Log in (if needed)

curl -s -c cookies.txt -X POST https://your-host/api/auth/login \
  -H 'Content-Type: application/json' \
  -d '{"password": "your-server-password"}'

On success the server sets a session cookie (hermes_session by default). Send that cookie on every subsequent request. In a native client, use a cookie-storing HTTP session: URLSession stores cookies by default, while OkHttp does so only once you configure a CookieJar on the client. See Authentication.
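The same login request can be sketched in JavaScript. The function name `login` and the injectable `fetchImpl` parameter are hypothetical, and treating any non-2xx status as a failed login is an assumption; the actual error shape is described in Conventions.

```javascript
// Sketch: POST the password to /api/auth/login.
// `fetchImpl` is injectable so the call can be exercised without a server.
async function login(base, password, fetchImpl = fetch) {
  const res = await fetchImpl(`${base}/api/auth/login`, {
    method: "POST",
    credentials: "include", // in a browser, lets the session cookie be stored and sent
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ password }),
  });
  // Assumption: a failed login is signalled by a non-2xx status.
  return res.ok;
}
```

In a browser, `credentials: "include"` is what makes the cookie stick. Node's built-in fetch does not keep a cookie jar, so a Node client would have to carry the Set-Cookie value forward itself.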

4. List sessions

curl -s -b cookies.txt https://your-host/api/sessions
{ "sessions": [ { "session_id": "…", "title": "…", "updated_at": "…", "pinned": false } ] }
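A client usually needs one session_id from this list to start a turn. The sketch below picks the first entry and guards against an empty list; the helper name `firstSessionId` is hypothetical, and it assumes nothing about the order in which the server returns sessions.

```javascript
// Hypothetical helper: pull a session_id out of the /api/sessions response.
// Returns null when the list is missing or empty instead of throwing.
function firstSessionId(listResponse) {
  const sessions = Array.isArray(listResponse?.sessions) ? listResponse.sessions : [];
  return sessions.length > 0 ? sessions[0].session_id : null;
}
```

Returning null lets the caller decide what an empty list means for its UI rather than failing on `sessions[0]`.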

5. Start a streaming conversation

This is the core loop. Two steps: start a turn, then open the SSE stream to receive tokens live.

# (a) start a turn — returns a stream id
curl -s -b cookies.txt -X POST https://your-host/api/chat/start \
  -H 'Content-Type: application/json' \
  -d '{"session_id": "…", "message": "Hello!"}'
# → { "stream_id": "…" }

# (b) open the live stream (Server-Sent Events)
curl -N -b cookies.txt "https://your-host/api/chat/stream?stream_id=…"
# → a sequence of `data: {…}` frames: tokens, tool calls, then a final done frame

The full frame format is documented in Streaming (SSE). That's the whole model — a native client renders those frames into the live assistant bubble exactly like the web UI does.
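For clients that read the stream as raw text (for example over a plain HTTP response body rather than EventSource), the `data: {…}` frames have to be split out by hand. The following is a minimal sketch of that step under generic Server-Sent Events framing rules: events end at a blank line and payload lines start with `data:`. The function name `parseSseBuffer` is hypothetical, and it assumes every payload is JSON, as the frames above suggest.

```javascript
// Sketch: extract complete SSE events from a text buffer.
// Returns the parsed JSON frames plus any trailing partial event,
// which the caller should prepend to the next chunk it receives.
function parseSseBuffer(buffer) {
  const frames = [];
  const events = buffer.split(/\r?\n\r?\n/); // a blank line ends an event
  const rest = events.pop(); // the last piece may be incomplete
  for (const event of events) {
    const data = event
      .split(/\r?\n/)
      .filter((line) => line.startsWith("data:")) // skips comments and other fields
      .map((line) => line.slice(5).replace(/^ /, "")) // drop "data:" and one optional space
      .join("\n");
    if (data) frames.push(JSON.parse(data));
  }
  return { frames, rest };
}
```

Keeping the unparsed remainder in `rest` matters because network chunks can end in the middle of a frame.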

Minimal client, end to end

const BASE = "https://your-host";
const PW = "your-server-password"; // placeholder: the instance password, if one is set
const opts = { credentials: "include", headers: { "Content-Type": "application/json" } };

// login (skip if password_required is false)
await fetch(`${BASE}/api/auth/login`, { ...opts, method: "POST",
  body: JSON.stringify({ password: PW }) });

// list sessions
const { sessions } = await (await fetch(`${BASE}/api/sessions`, opts)).json();

// start a turn
const { stream_id } = await (await fetch(`${BASE}/api/chat/start`, { ...opts, method: "POST",
  body: JSON.stringify({ session_id: sessions[0].session_id, message: "Hi" }) })).json();

// stream the reply
const es = new EventSource(`${BASE}/api/chat/stream?stream_id=${stream_id}`, { withCredentials: true });
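es.onmessage = (e) => { const f = JSON.parse(e.data); /* append f.token / handle f.type */ };
// EventSource reconnects automatically when the connection drops, so close it
// once the turn is over (on the final done frame, or on error as done here).
es.onerror = () => es.close();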

Next: read Conventions for the shared rules (session IDs, error shapes, headers), then jump to the endpoint reference for the surface you're building.