---
name: fish-audio-tts
description: >
  Use this skill when integrating the Fish Audio TTS API for text-to-speech,
  voice cloning, or multilingual speech synthesis.
  Covers the REST and WebSocket APIs via a PHP proxy.
---

# Fish Audio TTS Skill

You are an expert at integrating the Fish Audio TTS API into web applications using a PHP proxy and vanilla JavaScript.

## Overview

Fish Audio provides studio-grade TTS with emotion control:
- **S1 model** — natural, emotionally rich speech
- **Voice cloning** — 15-second audio clip, 99% accuracy
- **8+ languages** — multilingual with emotion markers
- **WebSocket** — real-time streaming
- **REST API** — simple integration
- **Flat-rate pricing** — predictable costs

## Current Models

| Model | Quality | Notes |
|-------|---------|-------|
| `s1` | Studio-grade | Emotion control, multilingual |
| `fish-speech-1.5` | High | Top-ranked multilingual (TTS-Arena2 #1) |

> [!IMPORTANT]
> Fish Audio rebranded from Fish Speech in December 2025.
> Model `s1` is the current flagship.

## API Base

```
https://api.fish.audio/
```

Authentication: `Authorization: Bearer $FISH_AUDIO_API_KEY`
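
As a minimal sketch, the proxy can build these headers in one place so the key never reaches the browser. The helper name `fish_audio_headers` is hypothetical; the only thing taken from this document is the `FISH_AUDIO_API_KEY` environment variable.

```php
<?php
// Hypothetical helper: builds the request headers for Fish Audio calls.
// Throws instead of sending an empty bearer token when the key is missing.
function fish_audio_headers(?string $apiKey = null): array
{
    $apiKey = $apiKey ?? (getenv('FISH_AUDIO_API_KEY') ?: '');
    if ($apiKey === '') {
        throw new RuntimeException('FISH_AUDIO_API_KEY is not set');
    }
    return [
        'Authorization: Bearer ' . $apiKey,
        'Content-Type: application/json',
    ];
}
```

The returned array can be passed directly as `CURLOPT_HTTPHEADER`.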

## Key Endpoints

| Endpoint | Method | Purpose |
|----------|--------|---------|
| `/v1/tts` | POST | Generate speech (returns audio) |
| `/model` | GET | List voice models |
| `/model/{id}` | GET | Get voice model details |
| `/model` | POST | Create voice model (clone) |
| `wss://api.fish.audio/v1/tts/live` | WS | Real-time WebSocket streaming |

## Quick Start (PHP)

```php
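<?php
// Read the text to synthesize from the JSON body sent by the browser
// (assumed request shape: {"text": "..."}).
$input = json_decode(file_get_contents('php://input'), true);
$text = is_string($input['text'] ?? null) ? trim($input['text']) : '';
if ($text === '') {
    http_response_code(400);
    exit('Missing "text"');
}

$payload = [
    'text' => $text,
    'reference_id' => 'voice-model-id',  // pre-created voice model
    'format' => 'mp3',
    'mp3_bitrate' => 128
];

$ch = curl_init('https://api.fish.audio/v1/tts');
curl_setopt_array($ch, [
    CURLOPT_POST => true,
    CURLOPT_HTTPHEADER => [
        'Authorization: Bearer ' . getenv('FISH_AUDIO_API_KEY'),
        'Content-Type: application/json'
    ],
    CURLOPT_POSTFIELDS => json_encode($payload),
    CURLOPT_RETURNTRANSFER => true
]);
$audioData = curl_exec($ch);
$status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
curl_close($ch);

// Do not forward an upstream error body labelled as audio.
if ($audioData === false || $status !== 200) {
    http_response_code(502);
    exit('TTS request failed');
}

header('Content-Type: audio/mpeg');
echo $audioData;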
```

## Voice Cloning

```php
// Create a voice model from an audio clip (minimum 15 s)
$ch = curl_init('https://api.fish.audio/model');
// POST multipart: audio file, title, description
```
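
The multipart upload might be sketched as follows. This is an illustration, not the documented request shape: the field names `voices`, `title`, and `description` and both function names are assumptions, so check them against the Voice Models docs before relying on them.

```php
<?php
// Sketch only: the multipart field names below are assumptions.
function build_clone_fields(string $audioPath, string $title, string $description = ''): array
{
    if (!is_readable($audioPath)) {
        throw new InvalidArgumentException("Cannot read audio file: $audioPath");
    }
    return [
        'title'       => $title,
        'description' => $description,
        'voices'      => new CURLFile($audioPath),  // cURL sends this as a file part
    ];
}

// Hypothetical wrapper: POSTs the fields and returns the raw response body.
function create_voice_model(array $fields, string $apiKey): string
{
    $ch = curl_init('https://api.fish.audio/model');
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_HTTPHEADER     => ['Authorization: Bearer ' . $apiKey],
        CURLOPT_POSTFIELDS     => $fields,  // passing an array makes cURL use multipart/form-data
        CURLOPT_RETURNTRANSFER => true,
    ]);
    $body = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    if ($body === false || $status >= 400) {
        throw new RuntimeException("Voice model creation failed (HTTP $status)");
    }
    return $body;
}
```

Keeping the field construction separate from the HTTP call lets the upload be validated (file exists, title present) before any request is made.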

## Streaming (WebSocket)

For real-time TTS, use the WebSocket endpoint:
```
wss://api.fish.audio/v1/tts/live
```

Control parameters: speed, volume, audio format (opus, mp3, wav).
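
A streaming session could be modelled as a `start` message carrying the control parameters, followed by text chunks and a final `stop`. The sketch below only builds those messages as PHP arrays. The event names, the `request`/`prosody` layout, and the function names are assumptions; the wire encoding (assumed to be MessagePack) and the WebSocket client itself are left out, so verify all of this against the WebSocket docs.

```php
<?php
// Sketch only: event names and request layout are assumptions.
function ws_start_event(
    string $referenceId,
    string $format = 'opus',
    float $speed = 1.0,
    float $volume = 0.0
): array {
    // Formats listed above for the streaming endpoint.
    if (!in_array($format, ['opus', 'mp3', 'wav'], true)) {
        throw new InvalidArgumentException("Unsupported format: $format");
    }
    return [
        'event'   => 'start',
        'request' => [
            'text'         => '',  // text is streamed in later chunks
            'reference_id' => $referenceId,
            'format'       => $format,
            'prosody'      => ['speed' => $speed, 'volume' => $volume],
        ],
    ];
}

function ws_text_event(string $chunk): array
{
    return ['event' => 'text', 'text' => $chunk];
}

function ws_stop_event(): array
{
    return ['event' => 'stop'];
}
```

Each array would be serialized and sent as one WebSocket frame, in the order start, text (repeated), stop.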

## Output Formats

- `mp3` (configurable bitrate)
- `opus`
- `wav`
- `pcm`
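
When the proxy lets the client choose a format, the response `Content-Type` has to follow that choice. A minimal sketch, with a hypothetical function name; the `opus` and `pcm` entries are assumptions about how the audio is packaged:

```php
<?php
// Hypothetical mapping from the `format` request field to a Content-Type.
function tts_content_type(string $format): string
{
    $types = [
        'mp3'  => 'audio/mpeg',
        'opus' => 'audio/ogg',                 // assumes an Ogg container
        'wav'  => 'audio/wav',
        'pcm'  => 'application/octet-stream',  // raw samples, no container
    ];
    if (!isset($types[$format])) {
        throw new InvalidArgumentException("Unsupported format: $format");
    }
    return $types[$format];
}
```

Rejecting unknown formats in the proxy also keeps arbitrary client input out of the upstream request.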

## Emotion Markers

Fish Audio supports inline emotion markers in text for supported languages:
```
(happy) Hello! (sad) I miss you. (angry) Stop it!
```

## API Docs

- [API Reference](https://docs.fish.audio/api-reference)
- [Text-to-Speech](https://docs.fish.audio/text-to-speech)
- [Voice Models](https://docs.fish.audio/voice-models)
- [WebSocket Streaming](https://docs.fish.audio/websocket)

## Related Skills
- `ai-api` — AI integration patterns (LLM + TTS), PHP proxy architecture
- `tts-voice-instructor` — voice instruction engineering (OpenAI TTS)
