Optimize Ollama Models for BoltAI
BoltAI supports both online and local models. If privacy is your primary concern, you can use BoltAI to interact with your local AI models. We recommend using Ollama.
By default, Ollama uses a context window size of 2048 tokens. This is not ideal for more complex tasks such as document analysis or heavy tool use. Follow this step-by-step guide to modify the context window size in Ollama.
1. Prepare the Modelfile for the new model
In Ollama, a Modelfile serves as a configuration blueprint for creating and sharing models. To modify the context window size for a model on Ollama, we will need to build a new model with a new `num_ctx` configuration.
Create a new file `Modelfile` with this content:
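```
FROM <base-model>
PARAMETER num_ctx <context-window-size>
```

Only two directives are needed: `FROM` points to the base model you want to extend, and `PARAMETER num_ctx` overrides the default context window size (in tokens).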
Here is my Modelfile to build qwen2.5-coder with a 32K context window:
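```
# Assumes the base model has been pulled first: ollama pull qwen2.5-coder
FROM qwen2.5-coder
# 32K tokens
PARAMETER num_ctx 32768
```

The same approach works for any size the model supports; for example, to give a 128K-capable model its full window, set `num_ctx` to 131072.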
2. Create a new model based on the modified Modelfile
Run this command to create the new model:
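```
ollama create qwen2.5-coder-32k -f Modelfile
```

Here `qwen2.5-coder-32k` is the name of the new model, and `-f` points to the Modelfile in the current directory.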
If you've already pulled the base model, this should be very fast. You can verify it by running `ollama list`; you should see the new model `qwen2.5-coder-32k` in the list.
3. Try the new model in BoltAI
Go back to BoltAI and refresh the model list:

- Open BoltAI Settings (`command + ,`)
- Navigate to Models > Ollama
- Click "Refresh"
- Start a new chat with this model
Setting the context window parameter at runtime
There are two sets of APIs in Ollama: the Ollama API and the OpenAI-compatible API.
Ollama allows setting the `num_ctx` parameter at runtime when using the Ollama API endpoint. Unfortunately, this is not possible when using the OpenAI-compatible API, which BoltAI is using.
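For reference, here is what a runtime override looks like with the native Ollama API (a curl sketch, assuming Ollama is running on its default port 11434):

```
curl http://localhost:11434/api/generate -d '{
  "model": "qwen2.5-coder",
  "prompt": "Why is the sky blue?",
  "options": {
    "num_ctx": 32768
  }
}'
```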
In the next version, I will add support for the official Ollama API endpoint, but for now, please create a new model with the modified `num_ctx` parameter.
And that's it for now 👋
If you are new here, BoltAI is a Mac app that allows you to use top AI services and local models easily, all from a single native app.