Core Concepts
Completion
The text output generated by a language model in response to a prompt.
A completion is the model's response to your input prompt. In API terms, a completion consists of output tokens, which are generated one at a time. Output tokens are typically priced 2–5× higher than input tokens, because each generated token needs its own sequential forward pass, whereas the prompt is processed in a single parallel prefill step. Completion length therefore directly drives your output token cost.
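The cost arithmetic can be sketched as follows. This is a minimal illustration, not any provider's billing formula: the function names and the prices ($3 per million input tokens, $15 per million output tokens, a 5× ratio) are hypothetical and chosen only to make the numbers easy to follow.

```python
def token_cost(tokens: int, price_per_million: float) -> float:
    """Cost in dollars for `tokens` tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def request_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Total cost of one request: prompt (input) plus completion (output)."""
    return (
        token_cost(input_tokens, input_price_per_million)
        + token_cost(output_tokens, output_price_per_million)
    )


# Hypothetical prices, for illustration only (not real provider rates).
INPUT_PRICE = 3.00    # dollars per million input tokens
OUTPUT_PRICE = 15.00  # dollars per million output tokens

prompt_cost = token_cost(2_000, INPUT_PRICE)      # 2,000-token prompt
completion_cost = token_cost(500, OUTPUT_PRICE)   # 500-token completion
total = request_cost(2_000, 500, INPUT_PRICE, OUTPUT_PRICE)

print(f"prompt: ${prompt_cost:.4f}")
print(f"completion: ${completion_cost:.4f}")
print(f"total: ${total:.4f}")
```

Under these assumed prices, the 500-token completion costs more than the 2,000-token prompt, which is why capping completion length is often the most direct way to reduce spend.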
Related Terms
Token
The basic unit of text that a language model processes, and the unit in which usage is billed.
Output Price
The per-million-token cost charged for tokens the model generates.
Streaming
Delivering model output token-by-token as it is generated rather than waiting for the full response.
System Prompt
An instruction block sent before the conversation that configures model behavior.