Text to Speech Conversion

POST /v1/audio/speech

Convert text to speech using the specified model and voice settings.

Request

Body

required

Request parameters for converting text to speech.

    provider string required

    Name of the audio provider. Defaults to 'openai'.

    model string required

    Model identifier for the speech synthesis.

    input string required

    Text input to be converted into speech.

    voice string required

    Voice setting for the speech synthesis. Query /v1/models?voices for the complete list.

    voiceSettings object

    Additional settings for voice modulation.

    stability integer

    Possible values: <= 1

    Stability of the voice modulation, 0-1.

    similarity_boost integer

    Possible values: <= 1

    Boost the similarity of the voice modulation, 0-1.

    style integer

    Possible values: <= 1

    Style of the voice modulation, 0-1.

    use_speaker_boost boolean

    Whether to use speaker boost.

    responseFormat string

    The audio format of the response (e.g., mp3, opus). Defaults to mp3 (openai).

    speed float

    Possible values: >= 0.25 and <= 4

    Speed of the speech playback (openai).

    stream boolean

    Whether to stream the response.
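As a rough illustration of how the parameters above fit together, the following sketch builds a request body without sending it. Several things here are assumptions, not part of the reference: the host `https://api.example.com`, the JSON content type, the helper name `build_speech_request`, and the nesting of `stability`, `similarity_boost`, and `style` under `voiceSettings`. Authentication is left out because the reference does not describe it.

```python
import json
import urllib.request

# Hypothetical host; the reference only gives the path /v1/audio/speech.
BASE_URL = "https://api.example.com"


def build_speech_request(text, voice, model, provider="openai",
                         voice_settings=None, response_format=None,
                         speed=None, stream=None):
    """Build, but do not send, a POST request for /v1/audio/speech."""
    if speed is not None and not 0.25 <= speed <= 4:
        raise ValueError("speed must be between 0.25 and 4")

    # Required fields, named as in the parameter list.
    payload = {"provider": provider, "model": model,
               "input": text, "voice": voice}

    if voice_settings is not None:
        # Assumption: these keys nest under voiceSettings; each is
        # documented with a 0-1 range.
        for key in ("stability", "similarity_boost", "style"):
            value = voice_settings.get(key)
            if value is not None and not 0 <= value <= 1:
                raise ValueError(f"{key} must be between 0 and 1")
        payload["voiceSettings"] = dict(voice_settings)

    # Optional fields are sent only when set, so server defaults apply.
    optional = {"responseFormat": response_format,
                "speed": speed, "stream": stream}
    payload.update({k: v for k, v in optional.items() if v is not None})

    return urllib.request.Request(
        BASE_URL + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the body is JSON; auth headers are omitted because
        # the reference does not describe them.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_speech_request("Hello, world.", voice="alloy", model="tts-1-hd")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Against a real deployment, the request could be sent with `urllib.request.urlopen(request)` and the response bytes written to a file, once the actual host and any required credentials are known.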

Responses

The audio file or stream returned successfully. Additional usage details are provided in the 'X-Audio-Details' header.

Response Headers
  • X-Audio-Details string

    Provides metadata about the audio request, such as provider, route, model, cost, and latency.

Schema
    audioContent binary

    The binary content of the audio file. Additionally, check the headers for usage details. Example: 'mp3 file', 'header: X-Audio-Details: {"provider":"openai","route":"tts-1-hd","model":"tts-1-hd","voice":"alloy","promptChar":20,"cost":0.00063,"latencyMs":1299}'

    stream object

    Stream object if streaming is requested.
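The usage metadata in the 'X-Audio-Details' header can be read separately from the audio bytes. The sketch below assumes the header value is a JSON string, as the example in the schema suggests; the helper name `parse_audio_details` is hypothetical, and the sample value is copied from that example.

```python
import json


def parse_audio_details(headers):
    """Return the X-Audio-Details metadata as a dict, or None if absent."""
    # A plain dict lookup is case-sensitive; HTTP client libraries
    # usually expose case-insensitive header mappings instead.
    raw = headers.get("X-Audio-Details")
    if raw is None:
        return None
    # Assumption: the header value is a JSON object.
    return json.loads(raw)


# Sample header value taken from the schema example.
example_headers = {
    "X-Audio-Details": (
        '{"provider":"openai","route":"tts-1-hd","model":"tts-1-hd",'
        '"voice":"alloy","promptChar":20,"cost":0.00063,"latencyMs":1299}'
    )
}

details = parse_audio_details(example_headers)
print(details["provider"], details["model"], details["cost"], details["latencyMs"])
```

Returning None when the header is missing lets callers distinguish a response without usage details from one whose details failed to parse, which would raise `json.JSONDecodeError` instead.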
