Streaming UI from LLM Tokens
Render UI components incrementally as LLM tokens arrive — no spinners, no full-page reloads.
Core Pattern
Fetch your LLM endpoint and read the response body, which is a ReadableStream. Parse the SSE events on the client, accumulate the partial JSON they carry, and feed it into a state machine that decides when a component is complete enough to render.
Token → Component Pipeline
- Fetch the streaming endpoint with an Accept: text/event-stream header
- Pipe the response body through a TextDecoderStream
- Buffer the decoded text, split it into events at blank lines, and read each data: field as a partial JSON fragment (a network chunk can end in the middle of an event)
- Accumulate fragments into a structured state object — cards, rows, headings, code blocks
- Re-render the component tree on each meaningful state transition
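The buffering and splitting steps above can be sketched as follows. This is a minimal sketch, not a full SSE client: the class name SseSplitter is hypothetical, it assumes LF or CRLF line endings, and it ignores every field other than data: (event:, id:, and retry: are dropped).

```typescript
// Hypothetical helper: buffers decoded text and returns the data payload of
// every complete SSE event. An event ends at a blank line, and a network
// chunk can end anywhere, so the unterminated tail stays in the buffer.
class SseSplitter {
  private raw = "";

  push(chunk: string): string[] {
    // Normalize CRLF on the whole buffer so a CR/LF pair that was split
    // across two chunks is still collapsed once both halves have arrived.
    const text = (this.raw + chunk).replace(/\r\n/g, "\n");
    const end = text.lastIndexOf("\n\n");
    if (end === -1) {
      this.raw = text; // no complete event yet
      return [];
    }
    this.raw = text.slice(end + 2); // keep the incomplete tail

    const payloads: string[] = [];
    for (const event of text.slice(0, end).split("\n\n")) {
      const dataLines = event
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        // Drop the field name and the single optional space after the colon.
        .map((line) => line.slice(5).replace(/^ /, ""));
      // Comment-only events (lines starting with ":") carry no data lines.
      if (dataLines.length > 0) payloads.push(dataLines.join("\n"));
    }
    return payloads;
  }
}
```

In a browser, the chunks passed to push would typically come from a reader on response.body.pipeThrough(new TextDecoderStream()). Each returned payload is one fragment for the accumulator.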
State Machine Design
Define a finite set of UI states: idle, streaming, partial, complete, error. Incoming tokens drive the transitions between these states, although most tokens leave the state unchanged. Only render a component when its required fields are fully populated; a partial card stays hidden until the LLM emits its closing brace.
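One way to encode these states is a pure transition function plus a render gate. The text above does not specify the transitions, so everything here is an assumption made for illustration: the event names, the reading of "partial" as "a component is open but not yet closed", the choice to treat a stream that ends mid-component as an error, and the Card shape with its two required fields.

```typescript
type StreamState = "idle" | "streaming" | "partial" | "complete" | "error";

// Hypothetical events derived from the token stream.
type StreamEvent =
  | { type: "open" }
  | { type: "fragment"; componentClosed: boolean }
  | { type: "done" }
  | { type: "fail" };

function transition(state: StreamState, event: StreamEvent): StreamState {
  // complete and error are terminal: later events are ignored.
  if (state === "complete" || state === "error") return state;
  switch (event.type) {
    case "open":
      return state === "idle" ? "streaming" : state;
    case "fragment":
      if (state === "idle") return state; // fragment before open: ignore
      return event.componentClosed ? "streaming" : "partial";
    case "done":
      if (state === "idle") return state;
      // Assumption: ending while a component is still open means truncated output.
      return state === "partial" ? "error" : "complete";
    case "fail":
      return "error";
  }
}

// Hypothetical component shape; title and body stand in for "required fields".
interface Card {
  title: string;
  body: string;
}

// A card is shown only once it is closed and every required field is present.
function isRenderable(card: Partial<Card>, closed: boolean): boolean {
  return closed && typeof card.title === "string" && typeof card.body === "string";
}
```

Keeping transition free of side effects means it can serve as a reducer and be unit-tested without a live stream.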
Performance Guardrails
- Throttle re-renders with requestAnimationFrame so the tree updates at most once per display frame (60 fps on a 60 Hz screen, more on high-refresh displays)
- Wrap token-driven state updates in React.startTransition so React treats them as non-urgent and user input stays responsive during heavy token bursts
- Cap the accumulated state buffer at ~50 KB to prevent memory bloat on long streams
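The throttling and buffer-cap guardrails can be sketched without any framework. Both helper names are hypothetical. The scheduler is passed in so the sketch runs outside a browser, the cap counts UTF-16 code units rather than bytes, and refusing the chunk on overflow is only one possible policy (the list above does not say what should happen at the limit).

```typescript
type Schedule = (run: () => void) => unknown;

// Coalesces any number of updates into one render per scheduled frame,
// always rendering the most recent value.
function createFrameBatcher<T>(render: (latest: T) => void, schedule: Schedule) {
  let pending: { value: T } | null = null;
  return function update(value: T): void {
    const alreadyScheduled = pending !== null;
    pending = { value };
    if (alreadyScheduled) return; // a frame is already queued
    schedule(() => {
      const latest = pending;
      pending = null;
      if (latest !== null) render(latest.value);
    });
  };
}

// Roughly 50 KB for mostly-ASCII JSON; string length is an approximation of bytes.
const MAX_BUFFER_CHARS = 50_000;

// Appends a chunk unless that would exceed the limit. On overflow the buffer
// is returned unchanged so the caller can stop the stream or surface an error.
function appendCapped(
  buffer: string,
  chunk: string,
  limit: number = MAX_BUFFER_CHARS
): { buffer: string; overflowed: boolean } {
  if (buffer.length + chunk.length > limit) {
    return { buffer, overflowed: true };
  }
  return { buffer: buffer + chunk, overflowed: false };
}
```

In a browser the scheduler would be (run) => requestAnimationFrame(run), and in a React app the render callback could set state inside startTransition.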
Pro tip: Structure your LLM prompts to emit JSON with a predictable schema. Use a lightweight parser that tolerates truncation, because mid-stream the accumulated JSON almost always ends partway through a value.
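A truncation-tolerant parser can be approximated with JSON.parse alone. The sketch below (function names are hypothetical) closes any open string, array, or object, and if the result still does not parse it drops one trailing character and retries. That retry loop is quadratic in the worst case, so it suits small buffers only. A number cut mid-digit still parses (12 when 123 was on its way), so numeric fields should not be trusted until their component has closed.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Appends whatever closers the text is missing: a quote if it ends inside a
// string, then one bracket or brace per scope that is still open.
function closeOpenScopes(text: string): string {
  const closers: string[] = [];
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === "{") closers.push("}");
    else if (ch === "[") closers.push("]");
    else if (ch === "}" || ch === "]") closers.pop();
  }
  return text + (inString ? '"' : "") + closers.reverse().join("");
}

// Returns the longest prefix of `text` that parses once closed, or undefined
// if no prefix does.
function parsePartialJson(text: string): Json | undefined {
  for (let cut = text.length; cut > 0; cut--) {
    try {
      return JSON.parse(closeOpenScopes(text.slice(0, cut))) as Json;
    } catch {
      // Dangling key, colon, comma, or half-written literal: drop one
      // character and try again.
    }
  }
  return undefined;
}
```

For example, parsePartialJson('{"title": "Hel') yields an object whose title is "Hel", which is enough to show streaming text, while a card whose closing brace has not arrived can still be held back by the state machine.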