Temperature
Temperature is a parameter used in natural language processing (NLP) and generative AI to control the randomness and creativity of text generation. In large language models such as GPT-4 and the models behind ChatGPT and Bard, it reshapes the probability distribution used when deciding which token (a word or word fragment) to output next.

Temperature is typically set in the range of roughly 0.0 to 2.0, and its value changes the model's behavior as follows:

• Low temperature (e.g., 0.2–0.5): Selection is strongly biased toward high-probability tokens, making output more deterministic and consistent.
  → Suited for use cases that prioritize accuracy and stability (translation, summarization, code generation, etc.)
• High temperature (e.g., 1.0–1.5): Low-probability tokens are more likely to be selected, making output more diverse and creative, but potentially less consistent or logical.
  → Suited for use cases such as brainstorming, story generation, and writing poetry.
• Mid-range temperature (e.g., 0.7–1.0): Tends to produce responses that balance accuracy and creativity, and is commonly used as the default setting for many chatbots and generative AI services.

Technically, temperature controls the "sharpness" of the probability distribution. The model's raw scores (logits) are divided by the temperature before the softmax function converts them into probabilities:

  P(token) = softmax(logits / temperature)

As this formula shows, raising the temperature shrinks the differences between logits and flattens the distribution (= more randomness), while lowering it magnifies those differences and sharpens the distribution (= higher certainty). At a temperature of 1.0 the distribution is left unchanged.

Temperature is usually combined with top-k and top-p sampling, which restrict the set of candidate tokens. Together, these parameters allow fine-grained adjustment of the diversity and quality of generated text.
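
To make the softmax scaling concrete, here is a minimal Python sketch. It is an illustration, not how any particular model or service implements sampling. The function name softmax_with_temperature and the three example logits are made up for this example. Because the formula divides by the temperature, the sketch rejects a value of 0 and assumes a caller would treat that case as always picking the highest-scoring token.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Return softmax(logits / temperature) as a list of probabilities."""
    if temperature <= 0:
        # Assumption: temperature 0 is handled elsewhere as greedy (argmax) decoding.
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]

for t in (0.2, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(f"temperature={t}: {[round(p, 3) for p in probs]}")
```

Running the loop shows the probability of the highest-scoring token falling as the temperature rises, while the ranking of the tokens stays the same. Sampling the next token from the resulting list (for example with random.choices) is what turns the flatter distribution into more varied output.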