Control creativity vs determinism with temperature, top-p, and other sampling parameters.
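Temperature controls the randomness of model outputs by rescaling the token probabilities before sampling. At temperature 0, the model picks the most probable token every time (greedy decoding), so outputs are consistent and close to deterministic, although some hosted APIs can still show small run-to-run differences. At higher temperatures, it samples from a flatter distribution, which makes outputs more varied but less predictable.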
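To make the effect concrete, here is a minimal sketch of how temperature is commonly applied: the raw token scores (logits) are divided by the temperature before the softmax. The function name and the three example logits are made up for illustration, and real inference stacks may handle temperature 0 differently.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    if temperature <= 0:
        # Assumption: treat temperature 0 as greedy decoding,
        # putting all probability on the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (0, 0.3, 1.0, 1.5):
    print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

As the temperature rises, the top token's share of the probability shrinks and the alternatives gain weight, which is where the extra variance comes from.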
Temperature is not a creativity dial — it's a reliability dial. High temperature = high variance.
| Temperature | Use Case |
|---|---|
| 0 | Classification, extraction, factual Q&A, code generation |
| 0.3 | Structured summaries, technical writing |
| 0.7 | General writing, balanced creativity |
| 1.0 | Brainstorming, ideation, poetry |
| 1.5+ | Experimental; outputs can become incoherent |
Default for most production use cases: 0.2–0.5.
Top-p restricts sampling to the smallest set of tokens whose cumulative probability reaches P. At top-p 0.9, the model samples from tokens covering 90% of the probability mass, excluding low-probability tail tokens.
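The cutoff described above can be sketched in a few lines. This is an illustrative filter over a made-up four-token distribution, not any particular library's implementation; `top_p_filter` and the example probabilities are assumptions.

```python
def top_p_filter(probs, p):
    """Keep the smallest set of tokens whose cumulative probability reaches p, then renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break  # enough probability mass is covered; drop the rest
    total = sum(prob for _, prob in kept)
    return {token: prob / total for token, prob in kept}

# Hypothetical next-token distribution
probs = {"the": 0.5, "a": 0.3, "this": 0.15, "zebra": 0.05}
print(top_p_filter(probs, 0.9))
```

With top-p 0.9, the first three tokens already cover 95% of the mass, so the low-probability tail token `zebra` is excluded and the remaining probabilities are rescaled to sum to 1.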
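Top-p and temperature can be used together, but both narrow what the model is likely to sample, so changing one at a time makes the effect easier to attribute. Most practitioners tune temperature and leave top-p at 1.0 unless they specifically need to cut off the low-probability tail.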
Top-k limits the candidate tokens to the K most probable at each step. Less commonly used than temperature/top-p, but useful for very constrained outputs.
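For comparison with top-p, a top-k filter keeps a fixed number of candidates regardless of how the probability is spread. The sketch below uses the same kind of made-up distribution; `top_k_filter` is a hypothetical helper name.

```python
def top_k_filter(probs, k):
    """Keep only the k most probable tokens and renormalize their probabilities."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(prob for _, prob in ranked)
    return {token: prob / total for token, prob in ranked}

# Hypothetical next-token distribution
probs = {"the": 0.5, "a": 0.3, "this": 0.15, "zebra": 0.05}
print(top_k_filter(probs, 2))
```

With K = 2, only `the` and `a` remain as candidates, which is why top-k suits outputs where only a handful of continuations are acceptable.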
For production systems handling real user data, start at temperature 0 and increase only when outputs feel too rigid. A/B test temperature values against your eval criteria before deploying.
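One way to compare temperature values is a small sweep that scores each setting against labeled examples. The sketch below is only a harness shape: `fake_generate` is a hypothetical stand-in for a real model call, and its error rate is invented purely so the script runs on its own. In practice you would swap in your own client and eval criteria.

```python
import random

def fake_generate(prompt, temperature, rng):
    """Hypothetical stand-in for a model call; NOT real model behavior."""
    error_rate = min(0.5, 0.1 * temperature)  # invented for illustration only
    return "negative" if rng.random() < error_rate else "positive"

def sweep(temperatures, prompts, expected, generate, trials=50, seed=0):
    """Return {temperature: accuracy} over repeated trials of each prompt."""
    rng = random.Random(seed)  # fixed seed so the sweep is repeatable
    results = {}
    for t in temperatures:
        hits = 0
        for prompt, want in zip(prompts, expected):
            for _ in range(trials):
                hits += generate(prompt, t, rng) == want
        results[t] = hits / (trials * len(prompts))
    return results

scores = sweep([0, 0.3, 0.7, 1.0], ["Great product!"], ["positive"], fake_generate)
for t, acc in scores.items():
    print(f"temperature={t}: accuracy={acc:.2f}")
```

Running several trials per prompt matters because a single sample at a nonzero temperature says little about how often the output meets your criteria.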
Never use high temperatures in systems where accuracy matters more than variety — medical, legal, financial, or security-sensitive applications.