Text to Speech Conversion
POST /v1/audio/speech
Convert text to speech using the specified model and voice settings.
Request
- application/json
Body
required
Request parameters for converting text to speech.
Name of the audio provider. Defaults to 'openai'.
Model identifier for the speech synthesis.
Text input to be converted into speech.
Voice to use for the speech synthesis. Query /v1/models?voices for the complete list.
voiceSettings object
Additional settings for voice modulation.
Possible values: <= 1
Stability of the voice modulation, 0-1.
Possible values: <= 1
Boost the similarity of the voice modulation, 0-1.
Possible values: <= 1
Style of the voice modulation, 0-1.
Whether to use speaker boost.
The audio format of the response (e.g., mp3, opus). Defaults to mp3 (openai).
Possible values: >= 0.25 and <= 4
Speed of the speech playback (openai).
Whether to stream the response.
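As a rough illustration of the request body described above, the following Python sketch assembles a JSON payload and checks the documented ranges (0-1 for the voice modulation settings, 0.25-4 for speed) before anything is sent. Apart from voiceSettings, the field names are not given in this section, so provider, model, input, voice, stability, similarityBoost, style, useSpeakerBoost, responseFormat, speed, and stream are assumptions made for illustration only; check the actual schema before relying on them.

```python
import json


def _check_range(name, value, low, high):
    """Raise ValueError if value is set and falls outside [low, high]."""
    if value is not None and not (low <= value <= high):
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def build_speech_request(text, model, voice, *, provider="openai",
                         stability=None, similarity_boost=None, style=None,
                         speaker_boost=None, response_format="mp3",
                         speed=None, stream=False):
    """Build a request body for POST /v1/audio/speech.

    All JSON keys except "voiceSettings" are hypothetical names chosen
    for this sketch; the documented ranges are enforced locally.
    """
    _check_range("stability", stability, 0, 1)
    _check_range("similarity_boost", similarity_boost, 0, 1)
    _check_range("style", style, 0, 1)
    _check_range("speed", speed, 0.25, 4)

    # Only include voice settings the caller actually provided.
    voice_settings = {
        key: value
        for key, value in {
            "stability": stability,              # assumed key name
            "similarityBoost": similarity_boost,  # assumed key name
            "style": style,                      # assumed key name
            "useSpeakerBoost": speaker_boost,    # assumed key name
        }.items()
        if value is not None
    }

    payload = {
        "provider": provider,               # assumed key name
        "model": model,                     # assumed key name
        "input": text,                      # assumed key name
        "voice": voice,                     # assumed key name
        "responseFormat": response_format,  # assumed key name
        "stream": stream,                   # assumed key name
    }
    if voice_settings:
        payload["voiceSettings"] = voice_settings
    if speed is not None:
        payload["speed"] = speed            # assumed key name
    return payload


if __name__ == "__main__":
    body = build_speech_request("Hello, world!", "tts-1-hd", "alloy", speed=1.25)
    # Send this string as the application/json body of the POST request.
    print(json.dumps(body, indent=2))
```

Validating on the client side is optional; the server reports invalid parameters with a 400 response in any case.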
Responses
- 200
- 400
The audio file or stream returned successfully. Additional usage details are provided in the 'X-Audio-Details' header.
Response Headers
X-Audio-Details string
Provides metadata about the audio request, such as provider, route, model, cost, and latency.
- audio/*
Schema
The binary content of the audio file. Additionally, check the headers for usage details. Example: 'mp3 file', 'header: X-Audio-Details: {"provider":"openai","route":"tts-1-hd","model":"tts-1-hd","voice":"alloy","promptChar":20,"cost":0.00063,"latencyMs":1299}'
Stream object if streaming is requested.
Example (from schema)
{
"audioContent": "string",
"stream": {}
}
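The usage metadata in the X-Audio-Details header can be read with a few lines of Python. This sketch assumes the header value is a JSON object, as in the example value quoted in the schema description; the helper name parse_audio_details is hypothetical, and the header lookup is case-insensitive because HTTP header names are.

```python
import json


def parse_audio_details(headers):
    """Return the X-Audio-Details header as a dict, or None if absent.

    Assumes the header value is a JSON object, as in the documented example.
    """
    for name, value in headers.items():
        if name.lower() == "x-audio-details":
            return json.loads(value)
    return None


# Header value copied from the example in the schema description.
example_headers = {
    "Content-Type": "audio/mpeg",
    "X-Audio-Details": (
        '{"provider":"openai","route":"tts-1-hd","model":"tts-1-hd",'
        '"voice":"alloy","promptChar":20,"cost":0.00063,"latencyMs":1299}'
    ),
}

details = parse_audio_details(example_headers)
if details is not None:
    print(f"model={details['model']} cost={details['cost']} latency={details['latencyMs']} ms")
```

The same function works on any mapping of header names to values, such as the headers object most HTTP client libraries expose on a response.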
Invalid request parameters.