BoltAI Documentation
Whisper


Last updated 10 months ago


The Whisper plugin lets you transcribe audio files using OpenAI's Whisper model.

What is Whisper?

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.

How to set up the Whisper plugin?

Go to Settings > Plugins > Whisper. Select the Settings tab, then enter your OpenAI API key.

How to use the Whisper Plugin?

To use the Whisper plugin, make sure you've set up the API key. Then:

  1. Start a new chat. Choose an LLM that supports function calling (for example, GPT-4o).

  2. Enable the Whisper plugin

  3. Drag the audio file to the chat input field (not the chat list) and tell the LLM to transcribe it

Audio file input is limited to 25 MB. If your file is larger, enable the FFmpeg plugin so the file can be downsampled before it is sent to the OpenAI server.
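If you would rather prepare the file yourself before dragging it into the chat, the downsampling step can be sketched roughly as follows. This is a minimal sketch, not a description of what BoltAI or the FFmpeg plugin does internally. It assumes `ffmpeg` is installed and on your PATH, that "25 MB" means 25 × 2^20 bytes, and that 16 kHz mono at 32 kbit/s is acceptable for your audio. The helper names are hypothetical.

```python
import os
import subprocess

# Assumption: the 25 MB limit is interpreted as 25 * 2**20 bytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def needs_downsampling(size_bytes: int) -> bool:
    """Return True when a file of this size exceeds the upload limit."""
    return size_bytes > MAX_UPLOAD_BYTES


def build_ffmpeg_command(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that re-encodes src as mono 16 kHz audio.

    The sample rate and bitrate are illustrative choices, not values
    taken from the BoltAI documentation.
    """
    return [
        "ffmpeg",
        "-y",            # overwrite dst if it already exists
        "-i", src,       # input file
        "-ac", "1",      # one audio channel (mono)
        "-ar", "16000",  # 16 kHz sample rate
        "-b:a", "32k",   # 32 kbit/s audio bitrate
        dst,
    ]


def downsample_if_needed(src: str, dst: str) -> str:
    """Return the path to upload: src if small enough, else the new dst."""
    if not needs_downsampling(os.path.getsize(src)):
        return src
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
    return dst
```

For example, `downsample_if_needed("meeting.wav", "meeting_small.mp3")` would leave a small file untouched and re-encode a large one. A very long recording may still exceed the limit after re-encoding, so check the size of the result.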

FAQ

  1. Can I use this offline? No. This plugin uses the OpenAI API and requires an Internet connection and a paid OpenAI API account.

  2. Which Whisper model does it use? According to OpenAI's documentation: "The Audio API provides two speech to text endpoints, transcriptions and translations, based on our state-of-the-art open source large-v2 Whisper model." To use the better large-v3 model, please use the Whisper (via Groq) plugin.

  3. What are the limitations? According to OpenAI's documentation: "File uploads are currently limited to 25 MB and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm."
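The limits above can be checked before you drag a file into the chat. The following is a minimal sketch under stated assumptions: the helper name `check_audio_file` is hypothetical, the check looks only at the file extension (not the actual audio encoding), and "25 MB" is taken to mean 25 × 2^20 bytes.

```python
from pathlib import Path

# File types listed in the FAQ above.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}

# Assumption: the 25 MB limit is interpreted as 25 * 2**20 bytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def check_audio_file(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks OK."""
    problems = []
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file type: {ext or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 25 MB")
    return problems
```

A file that fails only the size check is a candidate for downsampling with the FFmpeg plugin. A file that fails the type check would need to be converted to one of the supported formats first.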