Overview

Azure Cognitive Services provides high-quality text-to-speech synthesis. Two service implementations are available: AzureTTSService (WebSocket-based) for real-time, low-latency streaming, and AzureHttpTTSService (HTTP-based) for batch synthesis. AzureTTSService is recommended for interactive applications that need streaming.

Installation

To use Azure services, install the required dependencies:
pip install "pipecat-ai[azure]"

Prerequisites

Azure Account Setup

Before using Azure TTS services, you need:
  1. Azure Account: Sign up at the Azure Portal
  2. Speech Service: Create a Speech resource in your Azure subscription
  3. API Key and Region: Get your subscription key and service region
  4. Voice Selection: Choose from available voices in the Voice Gallery

Required Environment Variables

  • AZURE_SPEECH_API_KEY: Your Azure Speech service API key
  • AZURE_SPEECH_REGION: Your Azure Speech service region (e.g., "eastus")
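
As a minimal sketch of how an application might read these two variables at startup, the helper below fails early with a clear message if either one is missing or empty. The names AzureSpeechConfig and load_azure_speech_config are hypothetical and used only for illustration; they are not part of Pipecat or the Azure SDK, and the code relies only on the Python standard library.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# The two variable names come from the list above.
REQUIRED_VARS = ("AZURE_SPEECH_API_KEY", "AZURE_SPEECH_REGION")


@dataclass(frozen=True)
class AzureSpeechConfig:
    """Hypothetical container for the two required settings."""
    api_key: str
    region: str


def load_azure_speech_config(
    env: Optional[Mapping[str, str]] = None,
) -> AzureSpeechConfig:
    """Read the required variables, raising if any is missing or blank.

    `env` defaults to os.environ; passing a plain dict makes testing easier.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return AzureSpeechConfig(
        api_key=env["AZURE_SPEECH_API_KEY"].strip(),
        region=env["AZURE_SPEECH_REGION"].strip(),
    )
```

Calling load_azure_speech_config() with no arguments reads from the process environment. The resulting api_key and region values can then be passed to whichever TTS service class you use; check that class's constructor for the exact parameter names.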