Models

Sliding window attention

Sliding window attention limits attention to a moving context window.

Quick definition

Sliding window attention restricts each token to attending over a fixed-size window of nearby tokens, typically the most recent ones, instead of the full sequence.

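It reduces compute for long inputs, because each token scores a bounded number of earlier tokens instead of the whole sequence. In model workflows, sliding window attention often shapes a model's capability and fit for long-context tasks.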
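To make the compute saving concrete, here is a minimal sketch that counts how many query-key pairs get scored with and without a window. It assumes causal attention and a window that includes the current token. The function names and the example sizes are illustrative, not taken from any particular model or library.

```python
def causal_pairs(seq_len: int) -> int:
    """Query-key pairs scored by full causal attention (grows quadratically)."""
    return seq_len * (seq_len + 1) // 2


def windowed_pairs(seq_len: int, window: int) -> int:
    """Pairs scored when each token sees at most `window` tokens, itself included.

    Assumption: the window counts the current token; some implementations
    define the window slightly differently.
    """
    return sum(min(i + 1, window) for i in range(seq_len))


if __name__ == "__main__":
    n, w = 32_768, 4_096  # illustrative sizes only
    full, windowed = causal_pairs(n), windowed_pairs(n, w)
    print(f"full: {full}, windowed: {windowed}, ratio: {windowed / full:.2f}")
```

For a fixed window, the windowed count grows roughly linearly with input length, while the full count grows quadratically.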

Model architecture and scale determine capability and speed. Context length, parameter count, and modality support vary across models.

Example: with a 4k-token window, each token attends only to the last 4k tokens.
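The same idea can be sketched as an attention mask. The example below is a hedged illustration in plain Python, with a window of 3 standing in for 4k so the pattern is easy to read. The name sliding_window_mask is hypothetical, and the sketch assumes causal attention with the current token counted inside the window.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j] == True when query position i may attend to key position j.

    Assumption: causal attention, and the window includes the current token,
    so position i sees positions i - window + 1 through i.
    """
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Print the mask as a grid: "x" marks an allowed query-key pair.
    for row in sliding_window_mask(seq_len=6, window=3):
        print(" ".join("x" if allowed else "." for allowed in row))
```

Each row has at most 3 allowed positions, and the band of allowed positions slides one step to the right with every new token.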

Bigger is not always better. Match the model to the task and evaluate in production.

In BoltAI, this shows up in model selection and configuration.