Core Concepts
Temperature
A sampling parameter that controls the randomness of model outputs.
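Temperature rescales the model's next-token probability distribution before sampling: the logits are divided by the temperature, so values below 1 sharpen the distribution and values above 1 flatten it. At temperature=0 decoding is greedy, always choosing the most likely token, which makes outputs effectively deterministic. Higher values (0.7–1.0) introduce randomness that suits creative outputs. For tasks that call for deterministic behavior, such as data extraction or code generation, low temperatures improve consistency and tend to reduce errors.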
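To make the scaling concrete, here is a minimal sketch in plain Python. It assumes the common formulation in which logits are divided by the temperature before a softmax; the function names and the example logits are hypothetical and do not come from any particular library.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Division by zero is undefined; temperature=0 is handled as greedy decoding.
        raise ValueError("temperature must be positive; use greedy_choice for 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_choice(logits):
    """Index of the most likely token, i.e. the temperature=0 behavior."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical logits for a three-token vocabulary.
logits = [2.0, 1.0, 0.1]

for t in (0.2, 0.7, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")

print("greedy pick:", greedy_choice(logits))
```

Running it shows the top token taking a larger share of the probability mass as the temperature drops, which is why low settings behave more consistently.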
Related Terms
Top-P (Nucleus Sampling)
A sampling strategy that limits token selection to the smallest set covering a cumulative probability threshold.
Completion
The text output generated by a language model in response to a prompt.
Inference
The process of running a trained model to generate outputs from new inputs.
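The Top-P entry above describes a selection rule that is easy to sketch. The following is an illustrative version in plain Python, assuming the probabilities have already been computed; `top_p_filter`, `sample_top_p`, and the example distribution are hypothetical names and values, not a library API.

```python
import random


def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p.

    Returns a dict mapping token index to its renormalized probability.
    """
    if not 0 < p <= 1:
        raise ValueError("p must be in the interval (0, 1]")
    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


def sample_top_p(probs, p, rng=random):
    """Sample one token index from the nucleus."""
    nucleus = top_p_filter(probs, p)
    indices = list(nucleus)
    weights = [nucleus[i] for i in indices]
    return rng.choices(indices, weights=weights, k=1)[0]


# Hypothetical next-token distribution over four tokens.
probs = [0.5, 0.3, 0.15, 0.05]
print(top_p_filter(probs, 0.9))
print(sample_top_p(probs, 0.9))
```

With this distribution and a threshold of 0.9, the least likely token falls outside the nucleus and can never be sampled, while the remaining probabilities are renormalized to sum to 1.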