# Phoenix TTS API contract (ElevenLabs-compatible)

**Last Updated:** 2026-02-10  
**Purpose:** Let virtual-banker (and other apps) switch from ElevenLabs to a Phoenix-hosted TTS service by changing only the endpoint.

---

## Required endpoints

The Phoenix TTS service **must** implement the same HTTP contract as ElevenLabs for the paths below. The base path is deployment-specific (the app’s `/tts` or similar); the paths here are shown with the `/v1` prefix.

### 1. Sync text-to-speech

- **Method:** `POST`
- **Path:** `/v1/text-to-speech/:voice_id`
- **Headers:**
  - `Content-Type: application/json`
  - `Accept: audio/mpeg`
  - Auth: either `xi-api-key: <key>` or `Authorization: Bearer <token>` (configurable in client)
- **Body (JSON):**
  ```json
  {
    "text": "Hello world",
    "model_id": "eleven_multilingual_v2",
    "voice_settings": {
      "stability": 0.5,
      "similarity_boost": 0.75,
      "style": 0,
      "use_speaker_boost": true
    }
  }
  ```
- **Response:** `200 OK`, body = raw **mp3** bytes (`audio/mpeg`).
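
As a rough illustration of the sync contract above, the following Python sketch builds the request using only the standard library. The function names `build_tts_request` and `synthesize`, the example base URL, and the 30-second timeout are assumptions made for this example; they are not taken from the virtual-banker client.

```python
import json
import urllib.request
from urllib.parse import quote


def build_tts_request(base_url, voice_id, text, auth_header):
    """Build (but do not send) a sync TTS request.

    base_url is assumed to already include the /v1 prefix,
    e.g. "https://phoenix.example.com/tts/v1".
    auth_header is a (name, value) pair such as ("xi-api-key", "<key>").
    """
    url = f"{base_url.rstrip('/')}/text-to-speech/{quote(voice_id, safe='')}"
    body = {
        "text": text,
        "model_id": "eleven_multilingual_v2",
        "voice_settings": {
            "stability": 0.5,
            "similarity_boost": 0.75,
            "style": 0,
            "use_speaker_boost": True,
        },
    }
    headers = {
        "Content-Type": "application/json",
        "Accept": "audio/mpeg",
        auth_header[0]: auth_header[1],
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


def synthesize(request, timeout=30):
    """Send the request and return the raw mp3 bytes.

    urlopen raises urllib.error.HTTPError for non-2xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read()
```

Separating request construction from sending keeps the contract (path, headers, body) checkable without a running service; switching providers then only changes `base_url` and the auth pair.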

### 2. Streaming text-to-speech

- **Method:** `POST`
- **Path:** `/v1/text-to-speech/:voice_id/stream`
- **Headers:** Same as sync.
- **Body:** Same JSON as sync.
- **Response:** `200 OK`, body = mp3 bytes (`audio/mpeg`, same format as sync), delivered as a **stream**.
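
A client of the streaming endpoint above would typically read the body in chunks rather than all at once. The sketch below shows one way to do that against any file-like body; the helper names and the 4096-byte chunk size are assumptions chosen for illustration.

```python
import io


def iter_audio_chunks(stream, chunk_size=4096):
    """Yield mp3 chunks from a file-like response body until it is exhausted."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def stream_to_file(stream, out, chunk_size=4096):
    """Copy a streamed body to `out` chunk by chunk; return the bytes written."""
    total = 0
    for chunk in iter_audio_chunks(stream, chunk_size):
        out.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    # Stand-in for a real response body, so the sketch runs offline.
    fake_body = io.BytesIO(b"\xff\xfb" * 5000)
    sink = io.BytesIO()
    print(stream_to_file(fake_body, sink))  # 10000
```

The object returned by `urllib.request.urlopen` supports `read(n)`, so it can be passed as `stream` directly; a player or a websocket could take the place of `out`.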

### 3. Health (recommended)

- **Method:** `GET`
- **Path:** `/health`, on the same origin as the TTS base URL and alongside the version prefix rather than under it (e.g. `https://phoenix.example.com/tts/health` if the base is `.../tts/v1`)
- **Response:** `200 OK` (body optional; used for readiness).
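
The readiness check above could be sketched as follows. This assumes the health URL is obtained by replacing the last path segment of the TTS base URL (such as `v1`) with `health`, which matches the example given; `health_url` and `is_ready` are hypothetical names, and the 5-second timeout is arbitrary.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def health_url(tts_base_url):
    """Replace the last path segment of the base URL (e.g. v1) with 'health'."""
    parts = urlsplit(tts_base_url)
    parent = parts.path.rstrip("/").rsplit("/", 1)[0]
    return urlunsplit((parts.scheme, parts.netloc, f"{parent}/health", "", ""))


def is_ready(tts_base_url, timeout=5):
    """Return True only if GET /health answers 200; any error counts as not ready."""
    try:
        with urllib.request.urlopen(health_url(tts_base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Because the response body is optional, the sketch inspects only the status code.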

---

## Optional

- **Auth:** If Phoenix uses a different scheme (e.g. Bearer only), clients set `TTS_AUTH_HEADER_NAME` / `TTS_AUTH_HEADER_VALUE` accordingly; the API contract itself does not change.
- **Visemes:** For better lip-sync, a future endpoint could return phoneme/viseme timings; clients would call it once it is available.
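
One way a client might resolve the configurable auth header mentioned above is sketched here. The variable names `TTS_AUTH_HEADER_NAME` and `TTS_AUTH_HEADER_VALUE` come from this document; the `auth_header` helper, the `xi-api-key` default, and the error on a missing value are assumptions for illustration, not the documented behaviour of the virtual-banker client.

```python
import os


def auth_header(env=None):
    """Return the auth header as a one-entry dict, read from the environment."""
    env = os.environ if env is None else env
    # Assumed default: fall back to the ElevenLabs-style header name.
    name = env.get("TTS_AUTH_HEADER_NAME", "xi-api-key")
    value = env.get("TTS_AUTH_HEADER_VALUE", "")
    if not value:
        raise ValueError("TTS_AUTH_HEADER_VALUE is not set")
    return {name: value}
```

For a Bearer-only deployment, setting `TTS_AUTH_HEADER_NAME=Authorization` and `TTS_AUTH_HEADER_VALUE="Bearer <token>"` would yield the second header form listed under the sync endpoint.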

---

## Reference

- Virtual-banker TTS client: `virtual-banker/backend/tts` (see `backend/tts/README.md`).
- ElevenLabs TTS API: [Text-to-speech](https://elevenlabs.io/docs/api-reference/text-to-speech), [Stream](https://elevenlabs.io/docs/api-reference/text-to-speech/stream).