Top-p
Quick definition
Top-p (nucleus sampling) restricts sampling to the smallest set of most probable tokens whose cumulative probability reaches p.
- Category: Generation
- Focus: output style and randomness
- Used in: lowering randomness for precise, repeatable answers
What it means
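Top-p adapts the size of the candidate set to how confident the model is. When a few tokens carry most of the probability, the set is small; when probability is spread across many tokens, the set is larger. In generation workflows, top-p shapes output style and randomness.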
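To illustrate how the candidate set adapts to the distribution, here is a minimal Python sketch. The function name `nucleus_size` and the two probability lists are made up for illustration; they do not come from BoltAI or any model provider.

```python
from itertools import accumulate

def nucleus_size(probs, p):
    """Count how many of the most probable tokens top-p would keep.

    Hypothetical helper for illustration only.
    """
    ranked = sorted(probs, reverse=True)
    # Walk the running total until it reaches the threshold p.
    for count, total in enumerate(accumulate(ranked), start=1):
        if total >= p:
            return count
    return len(ranked)

# Invented distributions: one confident, one uncertain.
peaked = [0.92, 0.04, 0.02, 0.01, 0.01]
flat = [0.25, 0.25, 0.2, 0.15, 0.15]

print(nucleus_size(peaked, 0.9))  # 1: the top token alone exceeds 0.9
print(nucleus_size(flat, 0.9))    # 5: every token is needed to reach 0.9
```

The same p value yields a candidate set of one token in the confident case and five in the uncertain case, which is the adaptive behavior described above.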
How it works
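Generation settings control how the model samples tokens. With top-p, the candidate tokens are ranked by probability, the smallest set whose cumulative probability reaches p is kept, the remaining tokens are discarded, and the next token is sampled from the kept set after its probabilities are renormalized.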
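The sampling procedure can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation used by BoltAI or any model provider: the names `top_p_filter` and `top_p_sample` and the toy vocabulary are hypothetical, and real systems operate on logits over a full vocabulary.

```python
import random

def top_p_filter(probs, p):
    """Keep the smallest set of most probable tokens whose cumulative
    probability reaches p, then renormalize. Hypothetical helper."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        total += prob
        if total >= p:
            break
    # Renormalize so the kept probabilities sum to 1.
    return {token: prob / total for token, prob in kept}

def top_p_sample(probs, p, rng=random):
    """Sample one token from the top-p candidate set."""
    nucleus = top_p_filter(probs, p)
    tokens = list(nucleus)
    weights = list(nucleus.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

# Invented next-token distribution for illustration.
next_token_probs = {"cat": 0.6, "dog": 0.25, "fish": 0.1, "rock": 0.05}

nucleus = top_p_filter(next_token_probs, 0.8)
print(sorted(nucleus))                      # ['cat', 'dog']
print(top_p_sample(next_token_probs, 0.8))  # 'cat' or 'dog'
```

With p set to 0.8, "cat" and "dog" together cover 0.85 of the probability, so "fish" and "rock" can never be sampled.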
Why it matters
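Like other generation settings, top-p trades off creativity, determinism, and safety: a lower value narrows the candidate set and makes output more predictable, while a higher value admits less likely tokens and more variety.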
Common use cases
- Lower randomness for precise, repeatable answers.
- Higher randomness for brainstorming and creative tasks.
- Stopping rules to end output at the right time.
Example
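A top-p of 0.9 keeps the smallest set of most probable tokens whose probabilities sum to at least 90%. For example, if the four most likely tokens have probabilities 0.5, 0.3, 0.15, and 0.05, the first three are kept (0.95 in total) and the fourth is dropped.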
Pitfalls and tips
High randomness can reduce accuracy, while low randomness can make output repetitive. Tune top-p per task and evaluate the results.
In BoltAI
In BoltAI, this appears in model settings that shape responses.