RECIPE

Streaming UI from LLM Tokens

Render UI components incrementally as LLM tokens arrive — no spinners, no full-page reloads.

Core Pattern

Open a ReadableStream from your LLM endpoint. Parse the SSE chunks on the client, accumulate partial JSON, and feed it into a state machine that decides when a component boundary is complete enough to render.

Token → Component Pipeline

fetch the streaming endpoint with Accept: text/event-stream
Pipe the response body through a TextDecoderStream
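Buffer the decoded text, split it into events on blank lines, strip the data: prefix from each line, and treat each payload as a partial JSON fragment (network chunks do not line up with event boundaries)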
Accumulate fragments into a structured state object — cards, rows, headings, code blocks
Re-render the component tree on each meaningful state transition
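
The parsing steps above can be sketched as follows. This is a minimal illustration, not a full SSE implementation: SseParser, TextChunkReader, and pumpPayloads are hypothetical names, and the sketch assumes every event carries its payload in data: lines and ends with a blank line. Other SSE fields (event:, id:, retry:) are ignored.

```typescript
// Hypothetical helper: collects decoded text chunks and returns the
// "data:" payload of every event that has been fully received.
class SseParser {
  private buffer = "";

  push(chunk: string): string[] {
    // Join first, then normalize CRLF, so a "\r\n" pair that was split
    // across two network chunks is still handled correctly.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");

    // A blank line terminates an event; the last piece may be unfinished.
    const events = this.buffer.split("\n\n");
    this.buffer = events.pop() ?? "";

    const payloads: string[] = [];
    for (const event of events) {
      const dataLines = event
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice(5).replace(/^ /, ""));
      if (dataLines.length > 0) payloads.push(dataLines.join("\n"));
    }
    return payloads;
  }
}

// Structural stand-in for the reader of a decoded text stream, declared
// locally so the sketch does not depend on DOM typings.
interface TextChunkReader {
  read(): Promise<{ done: boolean; value?: string }>;
}

// Drains the reader and hands each complete payload to the caller.
async function pumpPayloads(
  reader: TextChunkReader,
  onPayload: (payload: string) => void
): Promise<void> {
  const parser = new SseParser();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) return;
    if (value) {
      for (const payload of parser.push(value)) onPayload(payload);
    }
  }
}
```

In a browser, the reader would typically come from response.body.pipeThrough(new TextDecoderStream()).getReader() after a fetch call that sends Accept: text/event-stream. Keeping the parser free of network code makes it easy to test with hand-written chunks.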

State Machine Design

Define a finite set of UI states: idle, streaming, partial, complete, error. Incoming tokens drive the transitions between them. Render a component only once its required fields are fully populated; a partial card stays hidden until the LLM emits its closing brace.
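
One way to encode these states is a pure transition function plus a render gate. The recipe names the five states but not their exact meaning, so this sketch makes assumptions: streaming means tokens are arriving but nothing is renderable yet, and partial means at least one component is renderable while the stream continues. StreamEvent, CardDraft, transition, and isRenderable are hypothetical names.

```typescript
type UiState = "idle" | "streaming" | "partial" | "complete" | "error";

// Hypothetical events; renderableCount is the number of components whose
// required fields are complete after the latest token was applied.
type StreamEvent =
  | { type: "start" }
  | { type: "token"; renderableCount: number }
  | { type: "end" }
  | { type: "fail" };

// Pure transition function: unknown or out-of-order events leave the
// state unchanged instead of throwing.
function transition(state: UiState, event: StreamEvent): UiState {
  switch (event.type) {
    case "start":
      return state === "idle" ? "streaming" : state;
    case "token":
      if (state !== "streaming" && state !== "partial") return state;
      return event.renderableCount > 0 ? "partial" : "streaming";
    case "end":
      return state === "streaming" || state === "partial" ? "complete" : state;
    case "fail":
      return state === "complete" ? state : "error";
    default:
      return state;
  }
}

// Hypothetical card shape. `closed` is assumed to be set by the parser
// once the card's closing brace has arrived.
interface CardDraft {
  closed: boolean;
  title?: string;
  body?: string;
}

// Render gate: a card is shown only when it is closed and every
// required field is present.
function isRenderable(card: CardDraft): boolean {
  return (
    card.closed &&
    typeof card.title === "string" &&
    typeof card.body === "string"
  );
}
```

Because transition is a pure function, it can be used directly as a reducer (for example with React's useReducer) and tested without a live stream.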

Performance Guardrails

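Throttle re-renders with requestAnimationFrame so that at most one render runs per display frame (about 60 per second on a typical 60 Hz screen, more on high-refresh displays)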
Use React.startTransition to mark token-driven state updates as non-urgent, which keeps the UI responsive during heavy token bursts
Cap the accumulated state buffer at ~50 KB to prevent memory bloat on long streams
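
The first and third guardrails can be sketched as two small helpers. Both are hypothetical: createFrameBatcher takes the scheduler as a parameter (in a browser, pass (cb) => requestAnimationFrame(cb)) so the logic can run outside a browser, and appendCapped measures size in UTF-16 code units as an approximation of the ~50 KB limit. What to do on overflow (abort the stream, show an error) is left to the caller, since the recipe does not specify it.

```typescript
// Any function that runs a callback later, e.g. a requestAnimationFrame
// wrapper in the browser or a manual queue in tests.
type Schedule = (callback: () => void) => unknown;

// Coalesces many updates into one render per scheduled frame. Only the
// most recent value is rendered; intermediate values are dropped.
function createFrameBatcher<T>(
  render: (latest: T) => void,
  schedule: Schedule
): (value: T) => void {
  let latest: T | undefined;
  let scheduled = false;

  return (value: T): void => {
    latest = value;
    if (scheduled) return; // a frame is already pending
    scheduled = true;
    schedule(() => {
      scheduled = false;
      render(latest as T);
    });
  };
}

// Assumed cap: roughly 50 KB, counted in string length rather than bytes.
const MAX_BUFFER_CHARS = 50 * 1024;

// Appends a chunk unless doing so would exceed the cap. On overflow the
// buffer is returned unchanged and the caller decides how to react.
function appendCapped(
  buffer: string,
  chunk: string,
  maxChars: number = MAX_BUFFER_CHARS
): { buffer: string; overflow: boolean } {
  if (buffer.length + chunk.length > maxChars) {
    return { buffer, overflow: true };
  }
  return { buffer: buffer + chunk, overflow: false };
}
```

In a React app, the render callback passed to createFrameBatcher could wrap its state update in startTransition, which combines the first two guardrails.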

Pro tip: Structure your LLM prompts to emit JSON with a predictable schema. Use a lightweight parser that tolerates truncation: until the stream ends, the accumulated JSON will almost always be incomplete.
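
A truncation-tolerant parse can be approximated by closing whatever is still open and then calling JSON.parse. The function below is a hypothetical sketch, not a library API: it tracks open strings, objects, and arrays, appends the missing closers, and returns undefined when the repaired text still is not valid JSON (for example when the input stops right after a key or a colon).

```typescript
// Hypothetical helper: best-effort parse of a JSON prefix.
function parsePartialJson(text: string): unknown {
  const closers: string[] = [];
  let inString = false;
  let escaped = false;

  // Single pass to find out what is still open at the point of truncation.
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }

  let repaired = text;
  if (escaped) repaired = repaired.slice(0, -1); // drop a dangling backslash
  if (inString) repaired += '"'; // close an unterminated string
  repaired = repaired.replace(/,\s*$/, ""); // drop a trailing comma
  repaired += closers.reverse().join(""); // close innermost first

  try {
    return JSON.parse(repaired);
  } catch {
    // Still invalid (e.g. cut off after a key): caller keeps the last good state.
    return undefined;
  }
}

// Example: a card whose title is still streaming in.
const draft = parsePartialJson('{"cards":[{"title":"Hel');
console.log(JSON.stringify(draft));
```

Values produced this way can be provisional: a string or number that was cut off mid-token parses as its shorter prefix, so it is safer to treat the result as a draft and render a component only after its required fields are known to be complete.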