LLM Streaming

How LLM Streaming Works

  • LLM APIs typically stream tokens over Server-Sent Events (SSE). As each token arrives, the message content grows, and a ResizeObserver detects the height change automatically.
  • Height corrections are batched per animation frame, not per token, so the viewport stays pinned to the bottom without jitter.
  • Markdown rendering is powered by @humanspeak/svelte-markdown in streaming mode (~1.6ms avg per update). Code blocks, tables, and lists all render live without scroll disruption.
  • 💡 Track token costs across providers with ModelPricing.ai. Need a general-purpose virtual list? Try @humanspeak/svelte-virtual-list.
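The per-frame batching described above can be sketched roughly as follows. This is a minimal illustration, not the library's actual implementation: `FrameBatcher`, `report`, and the `Schedule` type are hypothetical names introduced here. The scheduler is passed in so that the same logic works with `requestAnimationFrame` in a browser or with a manual queue elsewhere.

```typescript
// Hypothetical sketch: coalesce many height reports into one flush per frame.
// None of these names come from the library; they are illustrative only.
type Schedule = (cb: () => void) => void;
type Flush = (heights: Map<string, number>) => void;

class FrameBatcher {
  private pending = new Map<string, number>();
  private scheduled = false;
  private flush: Flush;
  private schedule: Schedule;

  constructor(flush: Flush, schedule: Schedule) {
    this.flush = flush;
    this.schedule = schedule;
  }

  // Called on every height change (for example, once per streamed token).
  report(id: string, height: number): void {
    // Only the latest height per message matters within a frame.
    this.pending.set(id, height);
    if (this.scheduled) return; // a flush is already queued for this frame
    this.scheduled = true;
    this.schedule(() => {
      this.scheduled = false;
      const batch = this.pending;
      this.pending = new Map();
      this.flush(batch); // one correction pass for the whole frame
    });
  }
}
```

In a browser, one plausible wiring is to pass `(cb) => requestAnimationFrame(() => cb())` as the scheduler, call `report` from a `ResizeObserver` callback, and have the flush function apply the new heights and then set the container's `scrollTop` to its `scrollHeight` when the view is pinned to the bottom. However many tokens arrive within a frame, the scroll position is then corrected at most once per frame.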
Assistant 12:45 AM

Welcome! Click "Start Streaming" to see a simulated LLM response stream token-by-token with markdown rendering.