Streaming (Server-Sent Events)

The streaming protocol is the heart of the bridge. It is how a client renders a live assistant response — tokens appearing as they're generated, tool calls surfacing in real time, an auto-generated title arriving, and a final settled message — exactly as the web UI does. It is also how the client learns that the agent is waiting on the user: a tool-approval prompt or a clarifying question.

If you implement one thing well, make it this.

Transport

Standard Server-Sent Events over a long-lived HTTP/1.1 GET:

GET /api/chat/stream?stream_id=<id>
Accept: text/event-stream
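As a rough illustration of issuing this request without buffering the whole response, here is a minimal Python sketch that uses only the standard library. The base URL, the helper names `stream_url` and `iter_sse_lines`, and the absence of any authentication header are assumptions made for the example; they are not part of the bridge's documented surface.

```python
import urllib.parse
import urllib.request
from typing import Iterator


def stream_url(base_url: str, stream_id: str) -> str:
    """Build the stream URL, percent-encoding the stream id."""
    query = urllib.parse.urlencode({"stream_id": stream_id})
    return f"{base_url.rstrip('/')}/api/chat/stream?{query}"


def iter_sse_lines(base_url: str, stream_id: str, timeout: float = 300.0) -> Iterator[str]:
    """Yield decoded lines as the server sends them (hypothetical helper).

    `timeout` is a socket timeout; choose a value longer than the longest
    silence you expect on the stream.
    """
    req = urllib.request.Request(
        stream_url(base_url, stream_id),
        headers={"Accept": "text/event-stream"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # Iterating the response reads one line at a time instead of
        # waiting for the body to finish.
        for raw in resp:
            yield raw.decode("utf-8")


# Example URL only; no request is made here.
example = stream_url("http://localhost:8000/", "abc 1")
```

Any real deployment will likely need credentials or cookies on the request; those are omitted because this section does not describe them.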

Each event on the wire is a normal SSE frame: an event: line naming the type and a data: line carrying a JSON object, terminated by a blank line:

event: token
data: {"text": "Hello"}

event: token
data: {"text": " world"}

event: done
data: {"session_id": "…", "message_id": …, …}

A generic EventSource client reads the type from event.type (which defaults to message when a frame has no event: line) and parses event.data as JSON. The reference iOS client keys off the SSE event: field; a browser EventSource can register addEventListener('token', …) once per type.
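For clients without a built-in EventSource, the framing above can be parsed by hand. The following is a minimal sketch, not the reference client's parser; the function name `parse_sse` is hypothetical. One deliberate deviation from browser behaviour is labelled in the comments: frames that carry an event: line but no data are still yielded, on the assumption that payload-less frames such as stream_end may arrive that way.

```python
import json
from typing import Any, Iterable, Iterator, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Any]]:
    """Yield (event_type, payload) for each complete SSE frame in `lines`."""
    event_type, data_parts, saw_field = "message", [], False
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the frame.
            if saw_field:
                text = "\n".join(data_parts)
                # Assumption: a frame without data is delivered with an
                # empty payload rather than dropped (browsers drop it).
                yield event_type, (json.loads(text) if text.strip() else {})
            event_type, data_parts, saw_field = "message", [], False
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips a single leading space
        if field == "event":
            event_type, saw_field = value, True
        elif field == "data":
            data_parts.append(value)
            saw_field = True


wire = [
    "event: token", 'data: {"text": "Hello"}', "",
    "event: token", 'data: {"text": " world"}', "",
    "event: stream_end", "",
]
frames = list(parse_sse(wire))
```

A frame that is cut off before its terminating blank line is discarded, which matches how SSE treats an incomplete final frame.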

The turn lifecycle

POST /api/chat/start   ──▶  { "stream_id": "…" }        (1) start a turn
GET  /api/chat/stream?stream_id=…                        (2) open the SSE stream
        │
        ├─ event: token            incremental assistant text  ── append
        ├─ event: reasoning        incremental reasoning/thinking text
        ├─ event: tool             a tool call STARTED
        ├─ event: tool_complete    that tool call FINISHED (result/preview)
        ├─ event: title            the session was auto-titled
        ├─ event: approval  │ initial   agent is WAITING for tool approval  ⇢ respond
        ├─ event: clarify   │ initial   agent is WAITING for an answer       ⇢ respond
        ├─ event: done             the assistant message SETTLED (final text + id)
        └─ event: stream_end       the stream is closing (turn over)

A typical successful turn: a burst of token (and reasoning) frames, possibly interleaved tool / tool_complete pairs, an optional title, then done, then stream_end.
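The lifecycle above maps naturally onto a small state reducer. The sketch below is one possible shape, with hypothetical names (`TurnState`, `apply_event`). Because this section does not name the fields of the done payload beyond session_id and message_id, the reducer stores that payload whole rather than guessing which field holds the final text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnState:
    text: str = ""                  # live assistant bubble
    reasoning: str = ""             # collapsible reasoning view
    title: Optional[str] = None     # set when a `title` frame arrives
    settled: Optional[dict] = None  # full payload of the `done` frame
    finished: bool = False          # True once `stream_end` arrives


def apply_event(state: TurnState, event_type: str, data: dict) -> TurnState:
    """Fold one stream event into the turn state."""
    if event_type == "token":
        state.text += data.get("text", "")
    elif event_type == "reasoning":
        state.reasoning += data.get("text", "")
    elif event_type == "title":
        state.title = data.get("title")
    elif event_type == "done":
        state.settled = data
    elif event_type == "stream_end":
        state.finished = True
    # Any other type leaves the state untouched.
    return state


state = TurnState()
for ev, payload in [
    ("token", {"text": "Hello"}),
    ("reasoning", {"text": "thinking"}),
    ("token", {"text": " world"}),
    ("title", {"session_id": "s1", "title": "Greeting"}),
    ("done", {"session_id": "s1", "message_id": 7}),
    ("stream_end", {}),
]:
    apply_event(state, ev, payload)
```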

Event reference

Every data payload is a JSON object. Fields below are the ones a client uses.

| `event:` | `data` payload | Client action |
| --- | --- | --- |
| `token` | `{ "text": "…" }` | Append `text` to the live assistant bubble. |
| `reasoning` | `{ "text": "…" }` | Append to the (collapsible) reasoning/thinking view. |
| `interim_assistant` | interim assistant snapshot | Replace the in-progress bubble with a reconciled interim state (used for live row reconciliation). |
| `tool` | `{ "event_type", "name", "preview", "args", "id"/"tool_call_id"/"tool_use_id" }` | Render a tool-call card as started. |
| `tool_complete` | same shape + `{ "duration", "is_error" }` | Update that card to finished (match on the id/`tid`). |
| `title` | `{ "session_id", "title" }` | Update the session's title in the sidebar. |
| `done` | `{ "session_id", "message_id", … }` plus the final message fields | Finalize the message: replace streamed text with the settled version, stop the typing indicator. |
| `approval` / `initial` | approval-pending payload | The agent paused for tool approval: show the prompt, then `POST /api/approval/respond`. |
| `clarify` / `initial` | clarification-pending payload | The agent asked a question: show it, then `POST /api/clarify/respond`. |
| `pending_steer_leftover` | `{ "text": "…" }` | Text the user typed mid-stream that wasn't consumed as a steer; restore it into the composer. |
| `stream_end` | (none) | The stream is over; close the `EventSource`. |
| `cancel` | (none) | The turn was cancelled (via `/api/chat/cancel`); settle the UI as cancelled. |
| `error` | `{ "error"/"message": "…" }` | Show the error; stop the typing indicator. |

Unknown event: types must be ignored, not treated as errors — the protocol adds frame types over time and old clients must tolerate them. (The reference client returns .ignored for anything it doesn't recognize.)
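To make the two rules above concrete (ignore unknown types, and pair tool_complete with its tool by id), here is a small hedged sketch. The names `tool_id`, `ToolCards` and `dispatch` are hypothetical, and the code assumes the id arrives under one of the three keys listed in the table; the card fields it stores are illustrative only.

```python
from typing import Callable, Dict, Optional

# Keys the table lists as possible carriers of the tool-call id.
ID_KEYS = ("id", "tool_call_id", "tool_use_id")


def tool_id(data: dict) -> Optional[str]:
    """Return the first id found under any of the known keys."""
    for key in ID_KEYS:
        if data.get(key) is not None:
            return str(data[key])
    return None


class ToolCards:
    def __init__(self) -> None:
        self.cards: Dict[str, dict] = {}

    def on_tool(self, data: dict) -> None:
        tid = tool_id(data)
        if tid is not None:
            self.cards[tid] = {"name": data.get("name"), "status": "running"}

    def on_tool_complete(self, data: dict) -> None:
        tid = tool_id(data)
        card = self.cards.get(tid) if tid is not None else None
        if card is None:
            return  # completion for a card that was never started here
        card["status"] = "error" if data.get("is_error") else "finished"
        card["duration"] = data.get("duration")


def dispatch(handlers: Dict[str, Callable[[dict], None]], event_type: str, data: dict) -> bool:
    """Call the handler for `event_type`; return False for unknown types."""
    handler = handlers.get(event_type)
    if handler is None:
        return False  # unknown frame type: ignore, never raise
    handler(data)
    return True


cards = ToolCards()
handlers = {"tool": cards.on_tool, "tool_complete": cards.on_tool_complete}
dispatch(handlers, "tool", {"name": "search", "tool_call_id": "t1"})
dispatch(handlers, "tool_complete", {"id": "t1", "duration": 0.4, "is_error": False})
ignored = dispatch(handlers, "some_future_event", {"x": 1})
```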

Reconnecting mid-stream

GET /api/chat/stream?stream_id=… can be re-opened if the connection drops — the server replays/continues the in-flight turn. GET /api/chat/stream/status?stream_id=… reports whether a stream is still live, so a client returning to the foreground can decide whether to re-attach or just reload the session. When a stream is already fully delivered, the server signals already_streamed so the client doesn't double-render.
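One way to structure the re-attach logic is a loop that re-opens the stream after a dropped connection, but only while the status check says the turn is still live. The sketch below injects both operations as callables so it stays independent of any HTTP library; `follow_turn`, `open_stream` and `is_live` are hypothetical names. Whether the server replays from the start or continues from the drop point is not specified here, so the caller must decide whether to reset the bubble on re-attach.

```python
from typing import Callable, Iterable, Iterator, Tuple

Event = Tuple[str, dict]


def follow_turn(
    open_stream: Callable[[], Iterable[Event]],
    is_live: Callable[[], bool],
    max_attempts: int = 5,
) -> Iterator[Event]:
    """Yield events, re-opening the stream while the turn is still live."""
    for _ in range(max_attempts):
        try:
            for event_type, data in open_stream():
                yield event_type, data
                if event_type == "stream_end":
                    return  # turn is over; do not re-attach
        except ConnectionError:
            pass  # dropped mid-stream; fall through to the status check
        if not is_live():
            return  # nothing to attach to; the caller reloads the session


calls = []


def fake_open() -> Iterator[Event]:
    """Stand-in for the real stream: drops once, then completes."""
    calls.append(1)
    if len(calls) == 1:
        yield ("token", {"text": "a"})
        raise ConnectionError("dropped")
    yield ("token", {"text": "b"})
    yield ("stream_end", {})


events = list(follow_turn(fake_open, is_live=lambda: True))
```

In a real client, `is_live` would wrap a call to GET /api/chat/stream/status; the shape of that response is not described in this section, so it is left abstract.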

Interacting with a live turn

Steer — POST /api/chat/steer injects a mid-turn nudge into the running agent without cancelling it.
Cancel — POST /api/chat/cancel (with the stream_id) stops the turn; expect a cancel frame.
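As a hedged sketch of issuing these calls, the helper below builds a JSON POST with the standard library. This section only says that cancel is sent "with the stream_id"; that the id travels in a JSON body (rather than a query parameter) is an assumption, as are the helper names `build_post` and `send` and the example host. The steer body is not described here, so no steer payload is shown.

```python
import json
import urllib.request


def build_post(base_url: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


# Assumed body shape: {"stream_id": ...}. Nothing is sent in this example.
cancel_req = build_post("http://localhost:8000", "/api/chat/cancel", {"stream_id": "abc123"})
```

After a successful cancel, the open SSE stream is where the cancel frame is expected to appear, so the client should keep reading it rather than closing it immediately.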

The other SSE streams

The same SSE mechanism powers two ambient, session-scoped streams a client keeps open to stay in sync even outside a chat turn:

GET /api/approval/stream?session_id=… — pushes tool-approval prompts as they arise.
GET /api/clarify/stream?session_id=… — pushes agent clarifying questions.

Both emit an initial frame with the current pending state on subscribe (atomic subscribe-plus-snapshot), then live updates. Respond via POST /api/approval/respond / POST /api/clarify/respond. See Approvals & clarify.
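The subscribe-plus-snapshot pattern can be modelled as a tiny tracker: the initial frame replaces whatever the client believed was pending, and later frames update it. This is an illustrative sketch with a hypothetical class name (`PendingTracker`). It assumes that an empty initial payload means nothing is pending, which this section does not confirm.

```python
from typing import Optional


class PendingTracker:
    """Tracks the pending prompt for one session-scoped stream."""

    def __init__(self, live_event: str) -> None:
        self.live_event = live_event  # "approval" or "clarify"
        self.pending: Optional[dict] = None

    def apply(self, event_type: str, data: dict) -> None:
        if event_type == "initial":
            # Snapshot on subscribe: overwrite local state entirely.
            # Assumption: an empty payload means nothing is pending.
            self.pending = data or None
        elif event_type == self.live_event:
            self.pending = data
        # Other event types are ignored.

    def responded(self) -> None:
        """Clear local state after POSTing the user's response."""
        self.pending = None


approvals = PendingTracker("approval")
approvals.apply("initial", {})
after_snapshot = approvals.pending
approvals.apply("approval", {"tool": "shell"})  # payload fields are illustrative
after_live = approvals.pending
approvals.responded()
```

Treating the snapshot as authoritative means a client that reconnects does not need to remember prompts it saw before the drop.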

Implementation checklist

[ ] Use a client that does not buffer the whole response — stream frames as they arrive (URLSession bytes, EventSource, OkHttp+SSE).
[ ] Dispatch on each frame's event: type; ignore unknown types.
[ ] Coalesce token frames into the bubble; swap to the done payload's final text when it arrives.
[ ] Match tool_complete to its tool by id.
[ ] Handle approval/clarify by opening the corresponding prompt UI and responding.
[ ] Tolerate reconnect: check /api/chat/stream/status on foreground.