# Text to Speech (Panini)

> Streaming speech synthesis across India's languages.

**Panini** is SomyaLabs' text-to-speech *model*. It streams natural,
low-latency audio via `POST /v1/speech/synthesize`.

## Request

| Field       | Type   | Required | Notes                                                                                  |
| ----------- | ------ | -------- | -------------------------------------------------------------------------------------- |
| `text`      | string | yes      | 1–2500 characters to synthesize.                                                       |
| `voice`     | string | no       | A voice **slug** from `GET /v1/voices`. Omit to use the default voice.                 |
| `ref_audio` | string | no       | Reference audio for voice cloning / style conditioning. See [below](#reference-audio). |

```json theme={null}
{ "text": "Namaste! Aapka swaagat hai.", "voice": "kiran" }
```
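
The constraints in the table above can be checked client-side before sending a request. The following is a minimal sketch, not part of any SomyaLabs SDK: `build_payload` and `MAX_TEXT_CHARS` are hypothetical names, and it assumes the 1–2500 limit counts Unicode code points (what Python's `len()` measures), which the table does not specify.

```python
MAX_TEXT_CHARS = 2500  # upper bound taken from the Request table


def build_payload(text, voice=None, ref_audio=None):
    """Return a JSON-serializable body for POST /v1/speech/synthesize."""
    # Assumption: the limit is measured in code points, as len() counts them.
    if not 1 <= len(text) <= MAX_TEXT_CHARS:
        raise ValueError(
            f"text must be 1-{MAX_TEXT_CHARS} characters, got {len(text)}"
        )
    payload = {"text": text}
    if voice:
        # Leave the field out when empty so the default voice is used.
        payload["voice"] = voice
    if ref_audio:
        # Passed through unchanged; its encoding is not validated here.
        payload["ref_audio"] = ref_audio
    return payload
```

Optional fields are omitted rather than sent as `null`, so the request body stays as small as the example above.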

## Supported languages

Panini supports **15 languages**:

| Language | Code  | Language  | Code  | Language | Code |
| -------- | ----- | --------- | ----- | -------- | ---- |
| Assamese | `as`  | Hindi     | `hi`  | Nepali   | `ne` |
| Bengali  | `bn`  | Kannada   | `kn`  | Odia     | `or` |
| Dogri    | `doi` | Maithili  | `mai` | Punjabi  | `pa` |
| English  | `en`  | Malayalam | `ml`  | Tamil    | `ta` |
| Gujarati | `gu`  | Marathi   | `mr`  | Telugu   | `te` |

## Voices

The language is selected through the **voice**: set the `voice` request
parameter, as there is no separate `language` field. The named voices are:

| Voice (slug) | Language  | Gender |
| ------------ | --------- | ------ |
| `omkar`      | Marathi   | Male   |
| `ananya`     | Odia      | Female |
| `sravani`    | Telugu    | Female |
| `arjun`      | Hindi     | Male   |
| `simran`     | Punjabi   | Female |
| `madhuri`    | Marathi   | Female |
| `priya`      | Hindi     | Female |
| `kiran`      | Kannada   | Male   |
| `teja`       | Telugu    | Male   |
| `meera`      | Malayalam | Female |

Omit `voice` (or send `""`) to use the **default voice**.

<Note>
  Pass the **slug** (e.g. `kiran`) as `voice`. `panini` is the **model name**, not
  a voice. For the live, account-specific list, call
  `GET /v1/voices` (`curl https://api.somya.ai/v1/voices -H "X-API-Key: YOUR_API_KEY"`).
</Note>
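
Because the language follows from the voice, a client may want to look up a slug by language. The sketch below hard-codes the named-voices table from this page; `NAMED_VOICES` and `voices_for` are hypothetical helpers, and the live list returned by `GET /v1/voices` may differ for your account.

```python
# Snapshot of the named-voices table above: slug -> (language, gender).
NAMED_VOICES = {
    "omkar": ("Marathi", "Male"),
    "ananya": ("Odia", "Female"),
    "sravani": ("Telugu", "Female"),
    "arjun": ("Hindi", "Male"),
    "simran": ("Punjabi", "Female"),
    "madhuri": ("Marathi", "Female"),
    "priya": ("Hindi", "Female"),
    "kiran": ("Kannada", "Male"),
    "teja": ("Telugu", "Male"),
    "meera": ("Malayalam", "Female"),
}


def voices_for(language, gender=None):
    """Return slugs matching a language (and optionally a gender), ignoring case."""
    matches = []
    for slug, (lang, gen) in NAMED_VOICES.items():
        if lang.lower() != language.lower():
            continue
        if gender is not None and gen.lower() != gender.lower():
            continue
        matches.append(slug)
    return matches
```

For example, `voices_for("Hindi")` returns `['arjun', 'priya']`. A language with no named voice in the table, such as Tamil, returns an empty list, in which case the caller can omit `voice` or consult `GET /v1/voices`.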

## Streaming response (NDJSON)

The endpoint responds with `application/x-ndjson` — one JSON object per line:

```json theme={null}
{"chunk_b64": "<base64-encoded WAV>", "is_final": false}
{"chunk_b64": "<base64-encoded WAV>", "is_final": false}
{"chunk_b64": "<base64-encoded WAV>", "is_final": true}
```

| Field       | Type    | Meaning                                                                                                |
| ----------- | ------- | ------------------------------------------------------------------------------------------------------ |
| `chunk_b64` | string  | Base64 of a **complete, self-contained WAV** (RIFF header + PCM): 24 kHz, 16-bit, mono.                |
| `is_final`  | boolean | `false` for incremental chunks; the record with `true` carries the **full utterance** as a single WAV. |
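
Since every `chunk_b64` is described as a self-contained WAV, its format can be checked with Python's standard `wave` module. This is a minimal sketch: `describe_chunk` is a hypothetical helper, and it assumes the RIFF header carries accurate size fields, which the standard parser relies on.

```python
import base64
import io
import wave


def describe_chunk(chunk_b64):
    """Decode one chunk_b64 value and report its WAV parameters."""
    wav_bytes = base64.b64decode(chunk_b64)
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        rate = wav.getframerate()
        return {
            "sample_rate_hz": rate,                     # expected: 24000
            "bits_per_sample": wav.getsampwidth() * 8,  # expected: 16
            "channels": wav.getnchannels(),             # expected: 1 (mono)
            "duration_s": wav.getnframes() / rate,
        }
```

Comparing the result against the documented 24 kHz, 16-bit, mono format is a cheap sanity check before handing audio to a player.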

**Two ways to consume it:**

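* **Stream playback**: decode each `chunk_b64` as it arrives and play the
  incremental (`is_final: false`) chunks back-to-back to start audio almost
  immediately. Skip the `is_final: true` record when streaming, since it repeats
  the full utterance.
* **Whole file**: ignore the partial chunks and keep only the `is_final: true`
  record; it is the complete WAV, ideal for saving or replay.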

```python Python — save the final WAV theme={null}
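import base64
import json

import requests

resp = requests.post(
    "https://api.somya.ai/v1/speech/synthesize",
    headers={"X-API-Key": "YOUR_API_KEY"},
    json={"text": "Namaste!", "voice": "kiran"},
    stream=True,
)
resp.raise_for_status()  # fail early instead of parsing an error body as NDJSON

final_wav = None
for line in resp.iter_lines():
    if not line:
        continue  # skip blank lines between records
    rec = json.loads(line)
    if rec.get("is_final"):
        # The final record carries the full utterance as a single WAV.
        final_wav = base64.b64decode(rec["chunk_b64"])

if final_wav is None:
    raise RuntimeError("stream ended without an is_final record")
with open("speech.wav", "wb") as f:
    f.write(final_wav)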
```
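
The stream-playback path can be sketched in a similar way. The hypothetical `iter_pcm` generator below takes the NDJSON lines (for example `resp.iter_lines()`), strips the WAV header from each incremental chunk, and yields raw PCM that can be written to an audio sink as it arrives. It assumes the final record only repeats audio already delivered by the partial chunks, so it stops there.

```python
import base64
import io
import json
import wave


def iter_pcm(lines):
    """Yield raw PCM bytes from each non-final NDJSON record, in arrival order."""
    for line in lines:
        if not line:
            continue  # skip blank lines between records
        rec = json.loads(line)
        if rec.get("is_final"):
            # The final record is the whole utterance; playing it would repeat audio.
            break
        wav_bytes = base64.b64decode(rec["chunk_b64"])
        # Each chunk is a self-contained WAV, so parse it and drop the header.
        with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
            yield wav.readframes(wav.getnframes())
```

Each yielded block is 24 kHz, 16-bit, mono PCM per the field table above, so any player configured for that format can consume the blocks back-to-back. The choice of playback library is left open here.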

## Reference audio

`ref_audio` conditions synthesis on a reference sample (voice cloning / style
conditioning). For a reusable custom voice, upload it once via `POST /v1/voices`
and then pass its slug as `voice`.

<Note>
  The accepted `ref_audio` encoding (e.g. URL vs. base64) and custom-voice upload
  requirements are environment-specific — check the
  [API Reference](/api-reference/introduction) for the current schema before
  using it in production.
</Note>
