
# TTS package — ElevenLabs-compatible, Phoenix endpoint swap

This package provides a **text-to-speech client** that matches the [ElevenLabs TTS API](https://elevenlabs.io/docs/api-reference/text-to-speech) contract. You can point it at **ElevenLabs** or at a **Phoenix-hosted** TTS service that implements the same API shape. Switching between them is a configuration change (the base URL) and requires no code change.

**Note:** The repo [eleven-labs/api-service](https://github.com/eleven-labs/api-service) on GitHub is a PHP OpenAPI consumer library, not the voice TTS API. This client targets the **REST TTS API** at `api.elevenlabs.io` (and compatible backends).

---

## Parity with ElevenLabs TTS API

| Feature | ElevenLabs API | This client |
|--------|----------------|-------------|
| **Sync** `POST /v1/text-to-speech/:voice_id` | ✅ | ✅ `Synthesize` |
| **Stream** `POST /v1/text-to-speech/:voice_id/stream` | ✅ | ✅ `SynthesizeStream` |
| **Voice settings** (`stability`, `similarity_boost`, `style`, `use_speaker_boost`) | ✅ | ✅ `VoiceConfig` |
| **Model** (`model_id`) | ✅ | ✅ `SetModelID` / default `eleven_multilingual_v2` |
| **Auth** `xi-api-key` header | ✅ | ✅ |
| **Output** `Accept: audio/mpeg` (mp3) | ✅ | ✅ |
| **Retries** (5xx, backoff) | — | ✅ on sync |
| **Visemes** (lip sync) | ❌ (no phoneme API) | ✅ client-side approximation |
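
The retry row can be pictured with a small sketch. This is not the package's actual retry code: the attempt count, the doubling backoff, and the `errServer` sentinel are all assumptions chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errServer is a hypothetical sentinel standing in for a 5xx response.
var errServer = errors.New("server error (5xx)")

// retryWithBackoff calls fn up to attempts times. After each retryable
// failure it sleeps, doubling the wait each time. Any other error, or
// success, is returned immediately.
func retryWithBackoff(attempts int, base time.Duration, fn func() error) error {
	var err error
	wait := base
	for i := 0; i < attempts; i++ {
		err = fn()
		if err == nil || !errors.Is(err, errServer) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(wait)
			wait *= 2
		}
	}
	return err
}

func main() {
	calls := 0
	err := retryWithBackoff(3, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errServer // the first two attempts fail with a 5xx
		}
		return nil
	})
	fmt.Println(calls, err)
}
```

Retrying only the sync call keeps the logic simple: a partially delivered stream cannot be replayed transparently, which is one plausible reason the table lists retries for sync only.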

Optional ElevenLabs features not used here: the `output_format` query parameter, `optimize_streaming_latency`, and WebSocket streaming. To support the endpoint swap, a Phoenix host only needs to accept the same JSON body on the **sync** and **stream** routes and return **audio/mpeg**.
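
To make the contract concrete, here is a minimal sketch of the sync request a compatible host has to accept. The type and function names (`ttsRequest`, `newSyncRequest`) are hypothetical and not part of this package, the voice settings values are placeholders, and `style` and `use_speaker_boost` are left out for brevity.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// voiceSettings mirrors part of the voice_settings object in the request body.
type voiceSettings struct {
	Stability       float64 `json:"stability"`
	SimilarityBoost float64 `json:"similarity_boost"`
}

// ttsRequest is the JSON body shared by the sync and stream routes.
type ttsRequest struct {
	Text          string        `json:"text"`
	ModelID       string        `json:"model_id"`
	VoiceSettings voiceSettings `json:"voice_settings"`
}

// newSyncRequest builds POST {baseURL}/text-to-speech/{voiceID}.
// baseURL is expected to end in /v1, as in the examples in this README.
func newSyncRequest(baseURL, voiceID, apiKey, text string) (*http.Request, error) {
	body, err := json.Marshal(ttsRequest{
		Text:          text,
		ModelID:       "eleven_multilingual_v2",
		VoiceSettings: voiceSettings{Stability: 0.5, SimilarityBoost: 0.75}, // placeholder values
	})
	if err != nil {
		return nil, err
	}
	endpoint := baseURL + "/text-to-speech/" + url.PathEscape(voiceID)
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "audio/mpeg")
	req.Header.Set("xi-api-key", apiKey)
	return req, nil
}

func main() {
	req, err := newSyncRequest("https://api.elevenlabs.io/v1", "voice-123", "secret", "Hello world")
	if err != nil {
		fmt.Println("build request:", err)
		return
	}
	fmt.Println(req.Method, req.URL)
}
```

The stream route would differ only in the `/stream` suffix on the path; the body and headers stay the same.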

---

## Which TTS backend? (decision table)

| Env / condition | Backend used |
|----------------|--------------|
| `TTS_VOICE_ID` unset (or no auth) | **Mock** (no real synthesis) |
| `TTS_VOICE_ID` + `TTS_API_KEY` or `ELEVENLABS_*` set, `TTS_BASE_URL` unset | **ElevenLabs** (api.elevenlabs.io) |
| `TTS_BASE_URL` set (e.g. Phoenix) + auth + voice | **Phoenix** (or other compatible host) |
| `USE_PHOENIX_TTS=true` | Prefer Phoenix; use `TTS_BASE_URL` or `PHOENIX_TTS_BASE_URL` |

Auth: the default header is `xi-api-key` (ElevenLabs). For a Phoenix host that expects a Bearer token, set `TTS_AUTH_HEADER_NAME=Authorization` and `TTS_AUTH_HEADER_VALUE=Bearer <token>`.
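
The decision table can be read as a small selection function. The sketch below is one possible reading, not the package's actual wiring: `chooseBackend` and the `backend` type are hypothetical, the `ELEVENLABS_*` fallbacks are omitted because their exact names are not spelled out here, and falling back to ElevenLabs when `USE_PHOENIX_TTS=true` is set without any base URL is an assumption.

```go
package main

import "fmt"

type backend string

const (
	backendMock       backend = "mock"
	backendElevenLabs backend = "elevenlabs"
	backendPhoenix    backend = "phoenix"
)

// chooseBackend applies the decision table to an environment lookup.
// It returns the backend and the base URL to use; an empty URL means
// the ElevenLabs default.
func chooseBackend(getenv func(string) string) (backend, string) {
	voice := getenv("TTS_VOICE_ID")
	hasAuth := getenv("TTS_API_KEY") != "" || getenv("TTS_AUTH_HEADER_VALUE") != ""
	if voice == "" || !hasAuth {
		return backendMock, ""
	}
	baseURL := getenv("TTS_BASE_URL")
	if baseURL == "" && getenv("USE_PHOENIX_TTS") == "true" {
		baseURL = getenv("PHOENIX_TTS_BASE_URL")
	}
	if baseURL != "" {
		return backendPhoenix, baseURL
	}
	return backendElevenLabs, ""
}

func main() {
	env := map[string]string{
		"TTS_VOICE_ID":         "default-voice-id",
		"TTS_API_KEY":          "secret",
		"USE_PHOENIX_TTS":      "true",
		"PHOENIX_TTS_BASE_URL": "https://phoenix.example.com/tts/v1",
	}
	b, u := chooseBackend(func(k string) string { return env[k] })
	fmt.Println(b, u)
}
```

Passing the lookup as a function (for example `os.Getenv`) keeps the selection logic testable without touching the real environment.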

---

## Using with Phoenix (swap endpoint)

1. **Phoenix TTS service** must expose the same contract:
   - `POST /v1/text-to-speech/:voice_id` — body: `{"text","model_id","voice_settings"}` → response: raw mp3
   - `POST /v1/text-to-speech/:voice_id/stream` — same body → response: streaming mp3
   - **Health:** `GET /health` at the same origin (e.g. `{baseURL}/../health`) returning 2xx so `tts.Service.Health(ctx)` can be used for readiness.

2. **Configure the app** with the Phoenix base URL (and optional auth):

   ```bash
   export TTS_BASE_URL="https://phoenix.example.com/tts/v1"
   export TTS_VOICE_ID="default-voice-id"
   # Optional: Phoenix uses Bearer token
   export TTS_AUTH_HEADER_NAME="Authorization"
   export TTS_AUTH_HEADER_VALUE="Bearer your-token"
   # Or use the feature flag to prefer Phoenix
   export USE_PHOENIX_TTS=true
   export PHOENIX_TTS_BASE_URL="https://phoenix.example.com/tts/v1"
   ```

3. **Health check:** The client’s `Health(ctx)` calls `GET {baseURL}/../health` when the base URL is not ElevenLabs. Wire this into your readiness probe or a `/ready` endpoint if TTS must be up before the app accepts traffic.

4. **In code** (e.g. for reuse in another project):

   ```go
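   // apiKey, voiceID, and ctx are assumed to be defined by the caller.
   opts := tts.TTSOptions{
       BaseURL:         "https://phoenix.example.com/tts/v1",
       AuthHeaderName:  "Authorization",
       AuthHeaderValue: "Bearer token",
   }
   svc := tts.NewElevenLabsTTSServiceWithOptionsFull(apiKey, voiceID, opts)
   if err := svc.Health(ctx); err != nil {
       // TTS host is not ready; fail the readiness probe or retry later.
   }
   audio, err := svc.Synthesize(ctx, "Hello world")
   if err != nil {
       // Handle the synthesis error.
   }
   _ = audio // synthesized result, ready to return to the caller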
   ```

No code change beyond config: same interface, different base URL and optional auth header.
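
For the health check in step 3, it helps to see where `{baseURL}/../health` ends up once the `..` segment is resolved. The sketch below uses only the standard library; `healthURL` is a hypothetical helper, and whether the client normalizes the path this way or sends it as written is an assumption to verify against the implementation.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
)

// healthURL resolves "{baseURL}/../health" into a clean absolute URL
// by dropping the last path segment of the base URL.
func healthURL(baseURL string) (string, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return "", err
	}
	u.Path = path.Join(u.Path, "..", "health")
	return u.String(), nil
}

func main() {
	h, err := healthURL("https://phoenix.example.com/tts/v1")
	if err != nil {
		fmt.Println("resolve health URL:", err)
		return
	}
	fmt.Println(h) // https://phoenix.example.com/tts/health
}
```

With the example base URL, the probe lands on `/tts/health` rather than `/health` at the root, so a Phoenix deployment behind a path prefix should expose its health route there.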

---

## Reuse across projects

This package lives in **virtual-banker** and can be imported by its Go module path (e.g. `github.com/your-org/virtual-banker/backend/tts`) or moved to a shared repo. Any project that needs TTS can:

- Depend on this package.
- Use `tts.Service` and either `NewMockTTSService()` or `NewElevenLabsTTSServiceWithOptions(apiKey, voiceID, baseURL)` / `NewElevenLabsTTSServiceWithOptionsFull(apiKey, voiceID, opts)` for custom auth.
- Set `baseURL` to ElevenLabs (`""` or `https://api.elevenlabs.io/v1`) or to the Phoenix TTS base URL.

The **interface** (`Synthesize`, `SynthesizeStream`, `GetVisemes`) stays the same regardless of backend.
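
A consuming project can lean on that stability by depending on a narrow interface of its own. The sketch below is illustrative only: `synthesizer`, `fakeTTS`, and `greet` are hypothetical names, and the `[]byte` return type of `Synthesize` is an assumption that should be checked against the actual `tts.Service` definition.

```go
package main

import (
	"context"
	"fmt"
)

// synthesizer is the slice of the TTS interface this caller needs.
// The []byte return type is an assumption, not confirmed by this README.
type synthesizer interface {
	Synthesize(ctx context.Context, text string) ([]byte, error)
}

// fakeTTS is a hypothetical stand-in backend for tests.
type fakeTTS struct{}

func (fakeTTS) Synthesize(_ context.Context, text string) ([]byte, error) {
	return []byte("audio:" + text), nil
}

// greet works with any backend that satisfies synthesizer, whether it
// talks to ElevenLabs, Phoenix, or nothing at all.
func greet(ctx context.Context, s synthesizer, name string) ([]byte, error) {
	return s.Synthesize(ctx, "Hello, "+name)
}

func main() {
	out, err := greet(context.Background(), fakeTTS{}, "Ada")
	if err != nil {
		fmt.Println("synthesize:", err)
		return
	}
	fmt.Println(string(out)) // audio:Hello, Ada
}
```

Because `greet` only sees the interface, swapping the fake for a real service built with one of the constructors above should not require touching the calling code.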