Token budget

Quick definition

A token budget is the maximum number of tokens allowed for a single request, counting both the prompt and the response.

Budgets control cost and keep requests within a model's context limit. When choosing a model, the token budget often determines which models fit the task.

Model architecture and scale determine capability and speed. Context length, parameter count, and modality support vary across models.

Example: allow 2,000 tokens for context and 500 for output, for a total budget of 2,500 tokens per request.
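
The split above can be sketched in a few lines of Python. This is a minimal illustration, not BoltAI's implementation: the `TokenBudget` class, its `fits` method, and the 4,096-token context limit are all hypothetical names and values chosen for the example, and the token counts are assumed to come from whatever tokenizer your model uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenBudget:
    """Hypothetical helper: a per-request budget split into context and output."""

    context_tokens: int = 2000  # tokens reserved for the prompt and context
    output_tokens: int = 500    # tokens reserved for the response

    @property
    def total(self) -> int:
        # The full budget is the sum of both shares.
        return self.context_tokens + self.output_tokens

    def fits(self, prompt_tokens: int, model_context_limit: int) -> bool:
        """Return True if the prompt stays within its share and the
        whole budget fits inside the model's context limit."""
        return (
            prompt_tokens <= self.context_tokens
            and self.total <= model_context_limit
        )


budget = TokenBudget()
print(budget.total)  # 2500

# 4096 is an assumed context limit, used only for illustration.
print(budget.fits(prompt_tokens=1800, model_context_limit=4096))  # True
print(budget.fits(prompt_tokens=2300, model_context_limit=4096))  # False
```

Checking the budget before sending a request makes it possible to trim the context or pick a model with a larger limit instead of failing at request time.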

Bigger is not always better. Match the model to the task and evaluate in production.

In BoltAI, this shows up in model selection and configuration.