Streaming (Server-Sent Events)

The streaming protocol is the heart of the bridge. It is how a client renders a live assistant response — tokens appearing as they're generated, tool calls surfacing in real time, an auto-generated title arriving, and a final settled message — exactly as the web UI does. It is also how the client learns that the agent is waiting on the user: a tool-approval prompt or a clarifying question.

If you implement one thing well, make it this.

Transport

Standard Server-Sent Events over a long-lived HTTP/1.1 GET:

GET /api/chat/stream?stream_id=<id>
Accept: text/event-stream

Each event on the wire is a normal SSE frame — an event: type line and a data: line carrying a JSON object:

event: token
data: {"text": "Hello"}

event: token
data: {"text": " world"}

event: done
data: {"session_id": "…", "message_id": …, …}

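A generic SSE client exposes the frame type as event.type (frames without an event: line default to message) and the payload as event.data, which the client parses as JSON. The reference iOS client keys off the SSE event: field. A browser EventSource must register a listener per type with addEventListener('token', …), because named events are not delivered to onmessage.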
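For clients without a built-in EventSource, the framing above can be parsed by hand. The following is a minimal sketch, not the reference implementation: parse_sse is a hypothetical helper name, and it assumes the response has already been decoded into text lines.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Optional[dict]]]:
    """Yield (event_type, payload) for each SSE frame found in `lines`."""
    event_type = "message"  # SSE default when a frame has no event: line
    data_parts = []
    saw_field = False

    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current frame.
            if saw_field:
                body = "\n".join(data_parts)
                yield event_type, (json.loads(body) if body else None)
            event_type, data_parts, saw_field = "message", [], False
            continue
        if line.startswith(":"):
            continue  # SSE comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the SSE format strips one leading space
        if name == "event":
            event_type, saw_field = value, True
        elif name == "data":
            data_parts.append(value)
            saw_field = True
        # Other fields (id, retry) are not used by this sketch.
```

One deliberate difference from a browser EventSource: this sketch also yields frames that have an event: line but no data, so payload-less frames are still seen. Whether the server sends an empty data: line for those frames is not stated here, so treat that as an assumption.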

The turn lifecycle

POST /api/chat/start   ──▶  { "stream_id": "…" }        (1) start a turn
GET  /api/chat/stream?stream_id=…                        (2) open the SSE stream
        ├─ event: token            incremental assistant text  ── append
        ├─ event: reasoning        incremental reasoning/thinking text
        ├─ event: tool             a tool call STARTED
        ├─ event: tool_complete    that tool call FINISHED (result/preview)
        ├─ event: title            the session was auto-titled
        ├─ event: approval  │ initial   agent is WAITING for tool approval  ⇢ respond
        ├─ event: clarify   │ initial   agent is WAITING for an answer       ⇢ respond
        ├─ event: done             the assistant message SETTLED (final text + id)
        └─ event: stream_end       the stream is closing (turn over)

A typical successful turn: a burst of token (and reasoning) frames, possibly interleaved tool / tool_complete pairs, an optional title, then done, then stream_end.
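The lifecycle above maps naturally onto a small state reducer. This is a sketch under assumptions: TurnState and apply_event are hypothetical names, and because the final message fields in done are not enumerated here, the sketch stores the whole done payload rather than guessing a field name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TurnState:
    text: str = ""                 # streamed assistant text so far
    reasoning: str = ""            # streamed reasoning/thinking text
    title: Optional[str] = None    # auto-generated session title, if any
    final: Optional[dict] = None   # the whole done payload, kept as-is
    typing: bool = True            # typing indicator is on while streaming
    closed: bool = False           # set once stream_end arrives


def apply_event(state: TurnState, event: str, payload: Optional[dict]) -> TurnState:
    """Fold one (event, payload) pair into the turn state."""
    payload = payload or {}
    if event == "token":
        state.text += payload.get("text", "")
    elif event == "reasoning":
        state.reasoning += payload.get("text", "")
    elif event == "title":
        state.title = payload.get("title")
    elif event == "done":
        state.final = payload      # render the settled message from this
        state.typing = False
    elif event == "stream_end":
        state.closed = True        # caller should close the connection now
    # Any other event type leaves the state untouched.
    return state
```

Keeping the reducer free of I/O means the same function can be fed from URLSession, EventSource, or a hand-written parser, and can be unit-tested with recorded frames.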

Event reference

Every data payload is a JSON object. Fields below are the ones a client uses.

| event: | data payload | Client action |
| --- | --- | --- |
| token | { "text": "…" } | Append text to the live assistant bubble. |
| reasoning | { "text": "…" } | Append to the (collapsible) reasoning/thinking view. |
| interim_assistant | interim assistant snapshot | Replace the in-progress bubble with a reconciled interim state (used for live row reconciliation). |
| tool | { "event_type", "name", "preview", "args", "id"/"tool_call_id"/"tool_use_id" } | Render a tool-call card as started. |
| tool_complete | same shape + { "duration", "is_error" } | Update that card to finished (match on the id/tid). |
| title | { "session_id", "title" } | Update the session's title in the sidebar. |
| done | { "session_id", "message_id", + final message fields } | Finalize the message: replace streamed text with the settled version, stop the typing indicator. |
| approval / initial | approval-pending payload | The agent paused for tool approval: show the prompt, then POST /api/approval/respond. |
| clarify / initial | clarification-pending payload | The agent asked a question: show it, then POST /api/clarify/respond. |
| pending_steer_leftover | { "text": "…" } | Text the user typed mid-stream that wasn't consumed as a steer; restore it into the composer. |
| stream_end | (none) | The stream is over; close the EventSource. |
| cancel | (none) | The turn was cancelled (via /api/chat/cancel); settle the UI as cancelled. |
| error | { "error"/"message": "…" } | Show the error; stop the typing indicator. |

Unknown event: types must be ignored, not treated as errors — the protocol adds frame types over time and old clients must tolerate them. (The reference client returns .ignored for anything it doesn't recognize.)
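Two rules from this section, matching tool_complete to its tool and ignoring unknown frame types, can be sketched as follows. The names ID_KEYS, tool_key, and handle_tool_event are hypothetical, and the order in which the three id fields are tried is an assumption; the field names themselves come from the event reference.

```python
from typing import Dict, Optional

# Field names listed in the event reference; the lookup order is an assumption.
ID_KEYS = ("id", "tool_call_id", "tool_use_id")


def tool_key(payload: dict) -> Optional[str]:
    """Return the first tool-call identifier present in the payload."""
    for key in ID_KEYS:
        value = payload.get(key)
        if value is not None:
            return str(value)
    return None


def handle_tool_event(cards: Dict[str, dict], event: str, payload: dict) -> str:
    """Update tool cards in place; return "handled" or "ignored"."""
    if event not in ("tool", "tool_complete"):
        return "ignored"  # unknown or unrelated types are never errors
    key = tool_key(payload)
    if key is None:
        return "ignored"  # nothing to match on
    # Create the card on first sight, even if tool_complete arrives first.
    card = cards.setdefault(key, {"name": payload.get("name"), "status": "started"})
    if event == "tool_complete":
        card["status"] = "error" if payload.get("is_error") else "finished"
        card["duration"] = payload.get("duration")
    return "handled"
```

Returning a status instead of raising keeps the caller's loop running when a newer server introduces a frame type this client has never seen.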

Reconnecting mid-stream

GET /api/chat/stream?stream_id=… can be re-opened if the connection drops — the server replays/continues the in-flight turn. GET /api/chat/stream/status?stream_id=… reports whether a stream is still live, so a client returning to the foreground can decide whether to re-attach or just reload the session. When a stream is already fully delivered, the server signals already_streamed so the client doesn't double-render.
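The foreground decision can be sketched as below. The endpoint paths are the ones named above, but the shape of the status response is not documented here: the boolean "active" field is purely an assumption for illustration, as are the function names and the example host.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode


def stream_url(base_url: str, stream_id: str) -> str:
    """URL for (re-)opening the chat SSE stream."""
    query = urlencode({"stream_id": stream_id})
    return f"{base_url.rstrip('/')}/api/chat/stream?{query}"


def status_url(base_url: str, stream_id: str) -> str:
    """URL for asking whether the stream is still live."""
    query = urlencode({"stream_id": stream_id})
    return f"{base_url.rstrip('/')}/api/chat/stream/status?{query}"


def on_foreground(status: dict, base_url: str, stream_id: str) -> Tuple[str, Optional[str]]:
    """Decide what to do after the app returns to the foreground.

    ASSUMPTION: the status response carries a boolean "active" field.
    Substitute whatever field the real endpoint returns.
    """
    if status.get("active"):
        return "reattach", stream_url(base_url, stream_id)
    return "reload", None  # turn is over: reload the session instead
```

For example, on_foreground({"active": True}, "https://bridge.example", "abc123") would return the action "reattach" together with the stream URL to re-open.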

Interacting with a live turn

  • Steer: POST /api/chat/steer injects a mid-turn nudge into the running agent without cancelling it.
  • Cancel: POST /api/chat/cancel (with the stream_id) stops the turn; expect a cancel frame.

The other SSE streams

The same SSE mechanism powers two ambient, session-scoped streams a client keeps open to stay in sync even outside a chat turn:

  • GET /api/approval/stream?session_id=… — pushes tool-approval prompts as they arise.
  • GET /api/clarify/stream?session_id=… — pushes agent clarifying questions.

Both emit an initial frame with the current pending state on subscribe (atomic subscribe-plus-snapshot), then live updates. Respond via POST /api/approval/respond / POST /api/clarify/respond. See Approvals & clarify.

Implementation checklist

  • [ ] Use a client that does not buffer the whole response — stream frames as they arrive (URLSession bytes, EventSource, OkHttp+SSE).
  • [ ] Parse per-event: type; ignore unknown types.
  • [ ] Coalesce token frames into the bubble; swap to the done payload's final text when it arrives.
  • [ ] Match tool_complete to its tool by id.
  • [ ] Handle approval/clarify by opening the corresponding prompt UI and responding.
  • [ ] Tolerate reconnect: check /api/chat/stream/status on foreground.