Audio
Request schema for Text-to-Speech generation
Text to convert to speech
Example: The quick brown fox jumps over the lazy dog.
TTS model. Available: 'sesame/csm-1b' (default)
Example: sesame/csm-1b
Preset voice name. Use GET /v2/audio/tts/presets for available voices.
Example: Alice
Custom voice cloning config (cannot use with preset_voice)
Routing mode: 'auto' (intelligent), 'opengpu' (blockchain), 'direct' (low-latency)
Example: auto
Async mode: returns task_address immediately, poll /v2/tasks/{task_address} for result. Default: false (sync mode).
Example: false
Successful Response
Task accepted (async mode). Poll the poll_url for status.
Validation Error
No content
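The Text-to-Speech request fields can be assembled and checked on the client before sending. The following is a minimal sketch, not the service's official client: the only key name confirmed by the schema is preset_voice; every other JSON key ("text", "model", "voice_clone", "routing", "async") and the function name build_tts_payload are assumptions made for illustration.

```python
from typing import Any, Dict, Optional

# Routing modes listed in the TTS schema description.
ROUTING_MODES = {"auto", "opengpu", "direct"}


def build_tts_payload(
    text: str,
    model: str = "sesame/csm-1b",
    preset_voice: Optional[str] = None,
    voice_clone: Optional[Dict[str, Any]] = None,  # hypothetical key name
    routing: str = "auto",                          # hypothetical key name
    async_mode: bool = False,
) -> Dict[str, Any]:
    """Build a TTS request body and reject invalid combinations early."""
    if not text.strip():
        raise ValueError("text must not be empty")
    # The schema states that the custom voice cloning config
    # cannot be combined with preset_voice.
    if preset_voice is not None and voice_clone is not None:
        raise ValueError("preset_voice and a voice cloning config are mutually exclusive")
    if routing not in ROUTING_MODES:
        raise ValueError(f"routing must be one of {sorted(ROUTING_MODES)}")

    # Key names other than "preset_voice" are assumed, not documented.
    payload: Dict[str, Any] = {
        "text": text,
        "model": model,
        "routing": routing,
        "async": async_mode,
    }
    if preset_voice is not None:
        payload["preset_voice"] = preset_voice
    if voice_clone is not None:
        payload["voice_clone"] = voice_clone
    return payload
```

Checking the preset/cloning conflict locally avoids a round trip that would otherwise end in a Validation Error response.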
Request schema for Automatic Speech Recognition (Whisper).
Supports transcription (original language) and translation (to English).
URL to audio file (WAV, MP3, M4A, WEBM, FLAC supported)
Example: https://example.com/audio.wav
ASR model. Available: 'openai/whisper-large-v3' (default)
Example: openai/whisper-large-v3
Task: 'transcribe' (original language) or 'translate' (to English)
Example: transcribe
Language hint (ISO 639-1 code). Auto-detected if not specified.
Example: en
Return timestamps: False (none), True (segment-level), 'word' (word-level)
Example: false
Sampling temperature (0.0 = deterministic)
Process audio in chunks of this length in seconds (default: 30)
Batch size for processing chunks (default: 16)
Routing mode: 'auto' or 'direct' (ASR is direct-only)
Example: auto
Return task_address immediately, poll /v2/tasks/{task_address} for result. Default: false (synchronous).
Example: false
Successful Response
Task accepted (async mode). Poll the poll_url for status.
Validation Error
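The Automatic Speech Recognition request and the async polling flow can be sketched together. This is an illustrative sketch under stated assumptions: all JSON key names in the payload, the "status" field, and the terminal values "completed" and "failed" are hypothetical, since the schema text above does not name them. The HTTP call is left to the caller: fetch_status is any function that requests /v2/tasks/{task_address} (or the returned poll_url) and returns the decoded JSON body.

```python
import time
from typing import Any, Callable, Dict, Optional, Union

# Hypothetical terminal states; the source does not list status values.
TERMINAL_STATES = {"completed", "failed"}


def build_asr_payload(
    audio_url: str,
    model: str = "openai/whisper-large-v3",
    task: str = "transcribe",
    language: Optional[str] = None,
    return_timestamps: Union[bool, str] = False,
    chunk_length_s: int = 30,
    batch_size: int = 16,
    async_mode: bool = False,
) -> Dict[str, Any]:
    """Build an ASR request body; all key names are assumed, not documented."""
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    # Allowed values per the schema: False, True, or 'word'.
    if not (isinstance(return_timestamps, bool) or return_timestamps == "word"):
        raise ValueError("return_timestamps must be False, True, or 'word'")
    payload: Dict[str, Any] = {
        "audio_url": audio_url,
        "model": model,
        "task": task,
        "return_timestamps": return_timestamps,
        "chunk_length_s": chunk_length_s,
        "batch_size": batch_size,
        "async": async_mode,
    }
    # Omit the language hint so the service can auto-detect it.
    if language is not None:
        payload["language"] = language
    return payload


def poll_task(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status until the task reaches an assumed terminal state."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch_status()
        if body.get("status") in TERMINAL_STATES:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval_s)
```

Passing the fetch function and the sleep function as arguments keeps the loop independent of any particular HTTP library and makes it easy to exercise with a stub.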