Core Concepts

Streaming

Delivering model output token-by-token as it is generated rather than waiting for the full response.

Streaming uses server-sent events (SSE) to push each token to the client as soon as it is produced. This reduces perceived latency: users see text start to appear immediately instead of waiting for the full completion. Streaming does not change token cost, but it is essential for conversational UIs and real-time applications.
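
As a rough illustration of the client side, the sketch below parses an SSE stream line by line and yields text as each event arrives. It follows the standard SSE framing (`data:` fields, a blank line ending each event, `:` comment lines), but the JSON payload shape with a `text` field, the `[DONE]` end marker, and the names `iter_sse_data`, `stream_text`, and `SAMPLE` are assumptions made for this example, not the format of any particular API.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line marks the end of the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
            continue
        if line.startswith(":"):
            # Comment line, often used as a keep-alive; ignore it.
            continue
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # SSE strips one leading space from the value
        if field == "data":
            data_parts.append(value)
    # An event not terminated by a blank line is dropped, as in the SSE spec.


def stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments as events arrive (payload shape is hypothetical)."""
    for payload in iter_sse_data(lines):
        if payload == "[DONE]":  # hypothetical end-of-stream marker
            break
        chunk = json.loads(payload)
        yield chunk["text"]  # hypothetical field name


# Simulated wire data; a real client would read these lines from an HTTP response.
SAMPLE = [
    'data: {"text": "Hel"}\n',
    "\n",
    ": keep-alive\n",
    'data: {"text": "lo"}\n',
    "\n",
    "data: [DONE]\n",
    "\n",
]

pieces = list(stream_text(SAMPLE))
```

In a UI, each fragment yielded by `stream_text` would be appended to the display as soon as it arrives, which is what produces the effect of text appearing progressively.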

Related Terms