Base URL: https://api.somya.ai. Replace YOUR_API_KEY with a real key; see Authentication.
1. Get an API key
Sign in to the Playground, open API keys, and create one. Copy the key value right away; it’s returned only once.
2. Synthesize speech (TTS)
Send text to POST /v1/speech/synthesize. The response streams as
newline-delimited JSON (application/x-ndjson), one record per audio chunk,
so you can begin playback before synthesis finishes.
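Consuming the stream one record at a time could be sketched as below. This is a minimal illustration rather than an official client: `iter_ndjson` is a hypothetical helper, and the `index` field in the stand-in body is invented for the demo. The actual fields of each record are described in Text to Speech, not here.

```python
import io
import json
from typing import Any, BinaryIO, Iterator


def iter_ndjson(stream: BinaryIO) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of an NDJSON byte stream."""
    for raw_line in stream:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines between records
        yield json.loads(line)


# Stand-in for a streaming HTTP response body. The "index" field is
# hypothetical; real records carry whatever the API defines per audio chunk.
fake_body = io.BytesIO(b'{"index": 0}\n\n{"index": 1}\n')
records = list(iter_ndjson(fake_body))
```

Because the helper only needs an object that yields byte lines, it should also work on the response object returned by `urllib.request.urlopen`, which lets each chunk be handled as soon as its line arrives.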
text is required (1–2500 characters). voice and ref_audio are optional —
omitting voice uses the default. To pick a specific voice (and its language),
pass a voice slug from GET /v1/voices. See
Text to Speech for the streaming format and voice details.
Authentication header
You can authenticate with either header; they’re equivalent:
X-API-Key for server-to-server calls with a SomyaLabs API key.
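A synthesis request using the X-API-Key header might be assembled as follows. This is a sketch under assumptions: it supposes the endpoint accepts a JSON body with `text` and an optional `voice`, which this page does not state explicitly. `build_synthesize_request` is a hypothetical helper, and `ref_audio` is left out because its format is not described here.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.somya.ai"
MAX_TEXT_CHARS = 2500  # documented limit: text must be 1-2500 characters


def build_synthesize_request(
    api_key: str, text: str, voice: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/speech/synthesize request."""
    # len() counts Unicode code points; how the API counts characters is an assumption.
    if not 1 <= len(text) <= MAX_TEXT_CHARS:
        raise ValueError(f"text must be 1-{MAX_TEXT_CHARS} characters")
    payload = {"text": text}
    if voice is not None:
        payload["voice"] = voice  # a voice slug from GET /v1/voices
    return urllib.request.Request(
        f"{BASE_URL}/v1/speech/synthesize",
        data=json.dumps(payload).encode("utf-8"),  # JSON body is an assumption
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",
            "Accept": "application/x-ndjson",
        },
        method="POST",
    )


request = build_synthesize_request("YOUR_API_KEY", "Hello from the quickstart.")
# To send it: urllib.request.urlopen(request) returns the streaming response.
```

Validating the length locally avoids a round trip for input the server would reject anyway. Omitting `voice` leaves the choice to the server default.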
3. Transcribe audio (ASR)
Send an audio file to POST /v1/speech/transcriptions as
multipart/form-data, under the audio field.
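One way to build that upload with only the Python standard library is sketched below. `build_transcription_request` is a hypothetical helper. The default filename and the `audio/wav` content type are placeholders, since the accepted audio formats are not listed on this page.

```python
import urllib.request
import uuid

BASE_URL = "https://api.somya.ai"


def build_transcription_request(
    api_key: str,
    audio_bytes: bytes,
    filename: str = "audio.wav",      # placeholder name
    content_type: str = "audio/wav",  # placeholder type; supported formats are an assumption
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST /v1/speech/transcriptions request."""
    boundary = uuid.uuid4().hex  # random delimiter between form parts
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/speech/transcriptions",
        data=head + audio_bytes + tail,
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


upload = build_transcription_request("YOUR_API_KEY", b"\x00\x01fake-audio-bytes")
# To send it: urllib.request.urlopen(upload) returns the JSON response.
```

The form part is named `audio` to match the field the endpoint expects. An HTTP library with built-in multipart support would do the same job with less code.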
The transcript is returned in data.text.
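Reading the transcript out of the response body could look like the sketch below. `extract_transcript` is a hypothetical helper, and the sample body is a minimal stand-in that contains only the `data.text` path named above; a real response may carry additional fields.

```python
import json


def extract_transcript(body: bytes) -> str:
    """Return the transcript stored at data.text in a JSON response body."""
    payload = json.loads(body)
    return payload["data"]["text"]


# Minimal stand-in for a transcription response; only data.text is taken from the docs.
sample_body = b'{"data": {"text": "hello world"}}'
transcript = extract_transcript(sample_body)
```

For non-200 responses the body follows the error envelope instead, so it is worth checking the status code before looking for `data.text`.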
Next steps
API Reference
Every endpoint, generated live from the OpenAPI spec.
Errors
Error envelope, codes, and how to handle non-200 responses.