Reasoning Model
A model variant that produces explicit step-by-step thinking before answering.
Reasoning models (o1, o3, Claude 3.7 Sonnet with extended thinking, Gemini Thinking) run chain-of-thought internally during inference, often generating thousands of 'thinking tokens' before producing a final answer. This can substantially improve accuracy on math, science, and logic tasks, but it increases latency and cost, because every thinking token must be generated before the answer begins. Billing varies by provider: thinking tokens are commonly charged as output tokens, and may instead be discounted or metered separately.
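The cost effect can be illustrated with a little arithmetic. The sketch below is a rough estimate only: the function name `estimate_cost`, its parameters, and the per-million-token prices are hypothetical and do not reflect any provider's actual pricing. It assumes thinking tokens are billed at the output rate unless a separate rate is supplied.

```python
def estimate_cost(input_tokens, thinking_tokens, answer_tokens,
                  input_price_per_m, output_price_per_m,
                  thinking_price_per_m=None):
    """Estimate the cost of one request in the same currency as the prices.

    Prices are per one million tokens. If no separate thinking price is
    given, thinking tokens are assumed to be billed as output tokens.
    """
    if thinking_price_per_m is None:
        thinking_price_per_m = output_price_per_m
    total = (input_tokens * input_price_per_m
             + thinking_tokens * thinking_price_per_m
             + answer_tokens * output_price_per_m)
    return total / 1_000_000


# Hypothetical prices: 3.00 per million input tokens, 15.00 per million output tokens.
with_thinking = estimate_cost(1_000, 5_000, 500, 3.0, 15.0)
without_thinking = estimate_cost(1_000, 0, 500, 3.0, 15.0)
print(f"with thinking:    {with_thinking:.4f}")
print(f"without thinking: {without_thinking:.4f}")
```

Under these assumed numbers, 5,000 thinking tokens account for most of the request's cost, even though the visible answer is only 500 tokens long.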
Related Terms
A prompting technique that instructs the model to reason step-by-step before answering.
The process of running a trained model to generate outputs from new inputs.
The time between sending a request and receiving the first token of a response.
Massive Multitask Language Understanding — a benchmark testing knowledge across 57 academic subjects.