Overview
WhisperSTTService provides offline speech recognition by running OpenAI's Whisper models locally. It supports multiple model sizes and hardware acceleration options (CPU, CUDA, and Apple Silicon via MLX) for privacy-focused transcription without external API calls.
- Whisper STT API Reference: Pipecat's API methods for Whisper STT integration
- Standard Whisper Example: a complete example using standard Whisper
- Whisper Documentation: OpenAI's Whisper research paper and model details
- MLX Whisper Example: an Apple Silicon optimized example
Installation
Choose your installation based on your hardware:

- Standard Whisper (CPU/CUDA)
- MLX Whisper (Apple Silicon)
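Assuming the package extras follow Pipecat's usual naming, the installs might look like the following; verify the extra names against the current Pipecat release:

```shell
# Standard Whisper (CPU/CUDA): extra name assumed
pip install "pipecat-ai[whisper]"

# MLX Whisper (Apple Silicon only): extra name assumed
pip install "pipecat-ai[mlx-whisper]"
```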
Prerequisites
Local Model Setup
Before using Whisper STT services, you need:

- Model Selection: Choose an appropriate Whisper model size (tiny, base, small, medium, large)
- Hardware Configuration: Set up CPU, CUDA, or Apple Silicon acceleration
- Storage Space: Ensure sufficient disk space for model downloads
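The model-size tradeoff above can be sketched as a small helper that picks the largest model fitting in available memory. The memory figures are rough estimates (in the spirit of the approximate VRAM numbers in OpenAI's Whisper README), and the helper itself is illustrative, not part of Pipecat's API:

```python
# Approximate memory needed per Whisper model size, in GB.
# Rough estimates only; check the Whisper README for current figures.
MODEL_MEMORY_GB = {
    "tiny": 1.0,
    "base": 1.0,
    "small": 2.0,
    "medium": 5.0,
    "large": 10.0,
}

# Ordered smallest to largest so we can pick the best fit.
MODEL_ORDER = ["tiny", "base", "small", "medium", "large"]


def pick_model(available_gb: float) -> str:
    """Return the largest Whisper model size that fits in available memory."""
    best = "tiny"  # fall back to the smallest model if nothing fits
    for name in MODEL_ORDER:
        if MODEL_MEMORY_GB[name] <= available_gb:
            best = name
    return best
```

For example, `pick_model(6.0)` selects `"medium"`, since `large` needs roughly 10 GB.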
Configuration Options
- Model Size: Balance between accuracy and performance based on your hardware
- Hardware Acceleration: Configure CUDA for NVIDIA GPUs or MLX for Apple Silicon
- Language Support: Whisper supports 99+ languages out of the box
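Whisper identifies languages by short ISO 639-1 style codes such as "en", "fr", or "zh". If your application works with full locale tags, a pre-normalization step like the sketch below can reduce them to the bare code; the helper name and this approach are assumptions for illustration, not Pipecat API:

```python
def to_whisper_language(locale_tag: str) -> str:
    """Normalize a locale tag like 'en-US' or 'pt_BR' to a bare
    primary subtag such as 'en' or 'pt', the form Whisper expects."""
    # Accept either '-' or '_' as the separator and keep the first subtag.
    primary = locale_tag.replace("_", "-").split("-")[0]
    return primary.lower()
```

For example, `to_whisper_language("pt_BR")` yields `"pt"`.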
No API keys are required: Whisper runs entirely locally for complete privacy.