Coqui, the XTTS maintainer, has shut down. XTTS may not receive future updates
or support.
Overview
XTTSTTSService provides multilingual voice synthesis with voice cloning capabilities through a locally hosted streaming server. The service supports real-time streaming and custom voice training using Coqui’s XTTS-v2 model for cross-lingual text-to-speech.
XTTS API Reference
Pipecat’s API methods for XTTS integration
Example Implementation
Complete example with voice cloning
XTTS Repository
Official XTTS streaming server repository
Voice Cloning
Learn about custom voice training
Installation
XTTS requires a running streaming server. Start the server using Docker.
Prerequisites
XTTS Server Setup
Before using XTTSTTSService, you need:
- Docker Environment: Set up Docker with GPU support for optimal performance
- XTTS Server: Run the XTTS streaming server container
- Voice Models: Configure voice models and cloning samples as needed
Required Configuration
- Server URL: Configure the XTTS server endpoint (default: http://localhost:8000)
- Voice Selection: Set up voice models or voice cloning samples
GPU acceleration is recommended for optimal performance. The server requires
CUDA support for best results.
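The two required settings can be validated before the service is constructed. The sketch below is a minimal illustration using only the Python standard library. XTTSSettings and server_reachable are hypothetical helper names, not part of Pipecat or the XTTS server, and the check only confirms that something accepts TCP connections at the configured address, not that it is an XTTS server.

```python
import socket
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class XTTSSettings:
    """Hypothetical holder for the required configuration (not a Pipecat class)."""

    base_url: str = "http://localhost:8000"  # default endpoint named in this page
    voice_id: str = ""  # voice or cloned-speaker name; supplied by the caller

    def host_port(self) -> tuple[str, int]:
        """Split base_url into (host, port), rejecting malformed URLs."""
        parsed = urlparse(self.base_url)
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            raise ValueError(f"invalid XTTS server URL: {self.base_url!r}")
        # Fall back to the scheme's standard port when none is given.
        port = parsed.port or (443 if parsed.scheme == "https" else 80)
        return parsed.hostname, port


def server_reachable(settings: XTTSSettings, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the configured endpoint succeeds."""
    host, port = settings.host_port()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, DNS failures
        return False


if __name__ == "__main__":
    settings = XTTSSettings()
    if server_reachable(settings):
        print(f"Server is accepting connections at {settings.base_url}")
    else:
        print(f"No server at {settings.base_url}; is the Docker container running?")
```

Running a check like this at startup turns a missing or misconfigured container into an immediate, readable error rather than a failure on the first synthesis request.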