FIG-001/ STREAMING

LLM Streaming

Simulated token-by-token streaming with markdown rendering and a virtualized viewport that stays pinned as message content grows.
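
As a rough illustration of what "simulated token-by-token streaming" could look like, here is a minimal sketch. The names `tokenize` and `simulateStream`, the whitespace-based token split, and the injectable `sleep` parameter are all assumptions made for this example; they are not taken from the demo's source. The 40 tokens per second default only mirrors the speed shown in the demo.

```typescript
// Hypothetical helper: split text into word-like tokens, keeping whitespace
// attached so that joining the tokens reproduces the original text.
function tokenize(text: string): string[] {
  return text.match(/\S+\s*|\s+/g) ?? [];
}

// Hypothetical simulator: yields one token at a time at a fixed rate.
// `sleep` is injectable so the pacing can be replaced in tests.
async function* simulateStream(
  text: string,
  tokensPerSecond: number = 40,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): AsyncGenerator<string> {
  if (tokensPerSecond <= 0) {
    throw new RangeError("tokensPerSecond must be positive");
  }
  const delayMs = 1000 / tokensPerSecond;
  for (const token of tokenize(text)) {
    await sleep(delayMs);
    yield token;
  }
}

// Example consumer: append each token to the message content as it arrives.
async function collect(stream: AsyncIterable<string>): Promise<string> {
  let content = "";
  for await (const token of stream) {
    content += token; // in a UI, this assignment would trigger a re-render
  }
  return content;
}
```

In a component, the loop body in `collect` is where the growing string would be handed to the markdown renderer.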

LLMs typically stream tokens over server-sent events (SSE). As each token arrives, the message content grows and a ResizeObserver detects the height change automatically.
Height corrections are batched once per animation frame, not once per token, so the viewport stays pinned to the bottom without jitter.
Markdown rendering is powered by @humanspeak/svelte-markdown in streaming mode (~1.6ms avg per update), so code blocks, tables, and lists all render live without disrupting scroll.
Track token costs across providers with ModelPricing.ai. Need a general-purpose virtual list? Try @humanspeak/svelte-virtual-list.
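
The per-frame batching of height corrections described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the component's actual implementation: `FrameBatcher`, `isPinnedToBottom`, and the 4px threshold are hypothetical names and values introduced here. The scheduler is passed in so the same logic works with `requestAnimationFrame` in a browser and with a fake queue elsewhere.

```typescript
type Schedule = (cb: () => void) => void;
type Flush = (heights: Map<string, number>) => void;

// Hypothetical helper: coalesces many height reports into one flush per frame.
class FrameBatcher {
  private pending: Map<string, number> = new Map();
  private scheduled = false;
  private readonly flush: Flush;
  private readonly schedule: Schedule;

  constructor(flush: Flush, schedule: Schedule) {
    this.flush = flush;
    this.schedule = schedule;
  }

  // Record the latest height for an item; later reports overwrite earlier ones.
  report(id: string, height: number): void {
    this.pending.set(id, height);
    if (this.scheduled) return; // a flush is already queued for this frame
    this.scheduled = true;
    this.schedule(() => {
      this.scheduled = false;
      const batch = this.pending;
      this.pending = new Map();
      this.flush(batch);
    });
  }
}

// Hypothetical check: is the viewport within `thresholdPx` of the bottom?
function isPinnedToBottom(
  scrollTop: number,
  clientHeight: number,
  scrollHeight: number,
  thresholdPx: number = 4,
): boolean {
  return scrollHeight - (scrollTop + clientHeight) <= thresholdPx;
}

// Browser wiring (assumed, shown as comments because it needs a DOM):
//   const batcher = new FrameBatcher(applyHeights, (cb) => requestAnimationFrame(cb));
//   const observer = new ResizeObserver((entries) => {
//     for (const entry of entries) {
//       batcher.report((entry.target as HTMLElement).id, entry.contentRect.height);
//     }
//   });
//   observer.observe(messageElement);
// Inside applyHeights, if the viewport was pinned before the update:
//   viewport.scrollTop = viewport.scrollHeight;
```

The point of the design is that forty tokens per second can produce many resize notifications, but the scroll position is corrected at most once per rendered frame.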

