
Translation management API

Control Palabra's real-time translation pipeline over WebSockets or a WebRTC data channel.


1. Prerequisites

Before you connect, make sure you have the following:

  • webrtc_url – The URL of the Palabra WebRTC server.
  • ws_url – The URL of the Palabra WebSocket server.
  • publisher access token – A JWT used to authorize your connection.

All three values are returned when you create a streaming session.


2. Choose a transport

2.1 Option 1. WebRTC

  1. Connect with any LiveKit client to webrtc_url using your publisher token.
  2. Once the connection is open, you can start sending commands through the default (empty-topic) WebRTC data channel.
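As a sketch, a control command for the data channel is a JSON envelope encoded to bytes. The `room` object and the `publishData` call shown in the comment assume a connected LiveKit JavaScript client; only the encoding step runs standalone here.

```javascript
// Encode a control command for the default (empty-topic) data channel.
function encodeCommand(messageType, data) {
  const envelope = JSON.stringify({ message_type: messageType, data });
  return new TextEncoder().encode(envelope); // data channels carry bytes
}

// With a connected LiveKit room you would then publish the bytes, e.g.:
//   await room.localParticipant.publishData(encodeCommand("pause_task", {}), { reliable: true });

const bytes = encodeCommand("pause_task", {});
const roundTrip = JSON.parse(new TextDecoder().decode(bytes));
```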

2.2 Option 2. WebSockets

Connect to ws_url, passing your publisher token as a query parameter:

// WebSocket control URL
const endpoint = `${ws_url}?token=${publisher}`;
const socket = new WebSocket(endpoint);

Once the connection is open, you can start sending commands.


3. Message format (WebRTC & WebSockets)

Every API packet, request and response alike, uses the same envelope:

{
  "message_type": "<string>",
  "data": { /* payload */ }
}

If message_type is "error", the data field contains diagnostic information.
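The envelope can be built and parsed with a pair of small helpers. This is a sketch with hypothetical names, not part of the API itself:

```javascript
// Wrap a payload in the shared envelope.
function wrap(messageType, data) {
  return JSON.stringify({ message_type: messageType, data });
}

// Unwrap a received envelope; error packets carry diagnostics in `data`.
function unwrap(raw) {
  const { message_type, data } = JSON.parse(raw);
  if (message_type === "error") {
    throw new Error(`${data.code}: ${data.desc || ""}`);
  }
  return { messageType: message_type, data };
}
```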


4. Typical workflow

  1. Create task – Send a set_task message to start the translation.
  2. Update task – Send another set_task message to update the translation settings during an ongoing translation.
  3. Pause processing – Send a pause_task message to pause the translation (stops billing). Resume with another set_task.
  4. Finish task – Send an end_task message. The server closes your connection automatically, and the session is invalidated after 1 minute.
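The lifecycle can be sketched as a fixed sequence of envelope sends, with the transport injected so the same code works over WebSockets or the WebRTC data channel. The settings arguments stand in for full set_task payloads:

```javascript
// Drive a task through its typical lifecycle. `send(type, data)` is any
// function that delivers one envelope over your chosen transport.
function runLifecycle(send, initialSettings, updatedSettings) {
  send("set_task", initialSettings);   // 1. create the task
  send("set_task", updatedSettings);   // 2. update settings mid-stream
  send("pause_task", {});              // 3. pause (stops billing)
  send("set_task", updatedSettings);   //    resume with another set_task
  send("end_task", { force: false });  // 4. finish; the server closes the connection
}
```

Over WebSockets, `send` might be `(type, data) => socket.send(JSON.stringify({ message_type: type, data }))`.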

5. Message Settings reference


6. Streaming audio configuration

6.1 Option 1. WebRTC audio I/O configuration

Use the following input/output stream configuration in your set_task message:

{
  "data": {
    "input_stream": {
      "content_type": "audio",
      "source": {
        "type": "webrtc"
      }
    },
    "output_stream": {
      "content_type": "audio",
      "target": {
        "type": "webrtc"
      }
    }
    // ...
  }
}
  1. Publish your microphone track to the LiveKit room.
  2. Subscribe to the translation tracks that Palabra publishes in the same LiveKit room after you send the set_task message.
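As a sketch, the stream section of a set_task payload for WebRTC audio I/O can be produced by a small helper. `content_type: "audio"` is taken from the API's validation-error example (permitted: 'audio'); adjust if your deployment differs:

```javascript
// Build the input/output stream section for WebRTC audio transport.
function webrtcStreams() {
  return {
    input_stream: { content_type: "audio", source: { type: "webrtc" } },
    output_stream: { content_type: "audio", target: { type: "webrtc" } },
  };
}
```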

6.2 Option 2. WebSocket audio I/O configuration

Use this configuration instead:

{
  "data": {
    "input_stream": {
      "content_type": "audio",
      "source": {
        "type": "ws",
        "format": "opus",     // or pcm_s16le, wav
        "sample_rate": 24000, // 16000 - 24000
        "channels": 1         // 1 or 2
      }
    },
    "output_stream": {
      "content_type": "audio",
      "target": {
        "type": "ws",
        "format": "pcm_s16le" // or zlib_pcm_s16le
      }
    }
  }
}
  1. Send base64-encoded audio chunks that exactly match the declared format.
  2. Receive base64-encoded TTS chunks in output_audio_data responses:
{
  "message_type": "output_audio_data",
  "data": {
    "transcription_id": "190983855fe3404e",
    "language": "es",
    "last_chunk": false,
    "data": "<base64-encoded audio>"
  }
}
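The base64 round trip can be sketched with the Node Buffer API (helper names are hypothetical):

```javascript
// Wrap a raw PCM chunk in an input_audio_data envelope.
function encodeInputChunk(pcmBytes) {
  return {
    message_type: "input_audio_data",
    data: { data: Buffer.from(pcmBytes).toString("base64") },
  };
}

// Recover raw audio bytes from a parsed output_audio_data envelope.
function decodeOutputChunk(message) {
  return Buffer.from(message.data.data, "base64");
}
```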

7. API messages


7.1 Requests (client → server)

| Message type     | Short description                                        |
| ---------------- | -------------------------------------------------------- |
| set_task         | Create/update translation task                           |
| end_task         | Finish translation task                                  |
| get_task         | Return current task                                      |
| pause_task       | Pause current task; use set_task to continue             |
| tts_task         | Generate TTS from text                                   |
| input_audio_data | Input audio data chunk (WebSockets audio transport only) |

set_task

Create a new task or modify the current one.

  • Sending it for the first time after creating a session starts the translation.
  • Sending it after pause_task resumes the translation.
  • Sending another set_task message during an ongoing translation updates the current settings in real time; there is no need to stop the translation.
{
  "message_type": "set_task",
  "data": {
    "input_stream": { /* Depends on transport; see the audio I/O section above */ },
    "output_stream": { /* Depends on transport; see the audio I/O section above */ },
    "pipeline": {
      "transcription": {
        "source_language": "string",
        "detectable_languages": ["string"],
        "segment_confirmation_silence_threshold": "float",
        "only_confirm_by_silence": "bool",
        "sentence_splitter": {
          "enabled": "bool"
        },
        "verification": {
          "auto_transcription_correction": "bool",
          "transcription_correction_style": "string"
        }
      },
      // Translation and speech generation settings for one or more target languages
      "translations": [
        {
          "target_language": "string",
          "translate_partial_transcriptions": "bool",
          "speech_generation": {
            "voice_cloning": "bool",
            "voice_id": "string",
            "voice_timbre_detection": {
              "enabled": "bool",
              "high_timbre_voices": [],
              "low_timbre_voices": []
            }
          }
        }
        // You can add more targets
      ],
      "translation_queue_configs": {
        "global": {
          "desired_queue_level_ms": "int",
          "max_queue_level_ms": "int",
          "auto_tempo": "bool"
        },
        "es": {
          "desired_queue_level_ms": "int",
          "max_queue_level_ms": "int"
        }
      },
      // Select response types to receive
      "allowed_message_types": [
        "translated_transcription",
        "partial_transcription",
        "partial_translated_transcription",
        "validated_transcription"
      ]
    }
  }
}

See the Translation settings breakdown for details on each field, and the Recommended settings for the translation task pipeline.

Need ASR-only mode? Do either of the following:

  • Set output_stream to null (you will still get text translations, but no TTS audio).
  • Use an empty translations list (no translations and no TTS).
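Both variants can be sketched as small transformations of a set_task payload. The helper names are hypothetical, and `data` stands in for a full set_task data object:

```javascript
// Keep text translations but drop TTS audio output.
function asrOnlyNoTts(data) {
  return { message_type: "set_task", data: { ...data, output_stream: null } };
}

// Transcription only: no translations and no TTS.
function asrOnlyNoTranslations(data) {
  return {
    message_type: "set_task",
    data: { ...data, pipeline: { ...data.pipeline, translations: [] } },
  };
}
```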

end_task

Finish the current task. The server closes the connection after receiving end_task.

{
  "message_type": "end_task",
  "data": { "force": false } // set to true to skip finalization of the last phrase
}

pause_task

Pause the current task. While paused, no audio data is processed and no billing occurs. Use set_task to resume translation.

{ "message_type": "pause_task", "data": {} }

get_task

Return the current task.

{ "message_type": "get_task", "data": {} }

tts_task

Generates TTS from text. The text is translated into every target_language listed in the task's translations section.

{
  "message_type": "tts_task",
  "data": {
    "text": "Hello, how are you?",
    "language": "en" // text language
  }
}

input_audio_data

Sends a base64-encoded audio data chunk when WebSockets is selected as the audio transport. The chunks you push must match the format / sample_rate / channels declared in your set_task command. The optimal chunk length is 320 ms.

{
  "message_type": "input_audio_data",
  "data": {
    "data": "base64 encoded data"
  }
}
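The 320 ms recommendation translates directly into a byte count for PCM s16le (2 bytes per sample). This sketch splits a raw buffer into input_audio_data messages; the defaults assume the 24000 Hz mono configuration shown earlier:

```javascript
// Split raw PCM s16le audio into ~320 ms input_audio_data messages.
function* pcmChunks(pcm, sampleRate = 24000, channels = 1, chunkMs = 320) {
  // samples per chunk * channels * 2 bytes per 16-bit sample
  const bytesPerChunk = Math.round((sampleRate * chunkMs) / 1000) * channels * 2;
  for (let offset = 0; offset < pcm.length; offset += bytesPerChunk) {
    yield {
      message_type: "input_audio_data",
      data: {
        data: Buffer.from(pcm.subarray(offset, offset + bytesPerChunk)).toString("base64"),
      },
    };
  }
}
```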

7.2 Responses (server → client)

| Message type                     | Short description                                              |
| -------------------------------- | -------------------------------------------------------------- |
| partial_transcription            | Unconfirmed ASR segment                                        |
| partial_translated_transcription | Unconfirmed translation segment                                |
| validated_transcription          | Final ASR segment                                              |
| translated_transcription         | Final translation                                              |
| output_audio_data                | Chunk of generated TTS audio (WebSockets audio transport only) |
| current_task                     | get_task command response                                      |
| error                            | Validation or runtime error                                    |
  • To receive partial_transcription, validated_transcription, and translated_transcription messages, include these message types in the allowed_message_types field of your set_task command.

  • To receive partial_translated_transcription messages, include it in the allowed_message_types field AND set translate_partial_transcriptions to true in your set_task command.
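Handling these responses usually comes down to routing on message_type. A minimal dispatcher sketch, where the handler names are placeholders for your own callbacks:

```javascript
// Route an incoming response envelope to the matching handler.
function dispatch(raw, handlers) {
  const { message_type, data } = JSON.parse(raw);
  switch (message_type) {
    case "partial_transcription":
    case "validated_transcription":
    case "partial_translated_transcription":
    case "translated_transcription":
      if (handlers.onText) handlers.onText(message_type, data.transcription);
      break;
    case "output_audio_data":
      if (handlers.onAudio) handlers.onAudio(data);
      break;
    case "error":
      if (handlers.onError) handlers.onError(data);
      break;
    default:
      break; // e.g. current_task
  }
}
```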


partial_transcription

Unconfirmed segment transcription:

{
  "message_type": "partial_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "en",
      "text": "One, two"
    }
  }
}

partial_translated_transcription

Unconfirmed segment translation:

{
  "message_type": "partial_translated_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "es",
      "text": "Um, dois,"
    }
  }
}

validated_transcription

Completed segment transcription:

{
  "message_type": "validated_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "en",
      "text": "One, two, three, four, five."
    }
  }
}

translated_transcription

Completed segment translation:

{
  "message_type": "translated_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "es",
      "text": "Um, dois, três, quatro, cinco."
    }
  }
}

output_audio_data

TTS audio chunk (when WebSockets is the audio transport).

{
  "message_type": "output_audio_data",
  "data": {
    "transcription_id": "190983855fe3404e",
    "language": "es",       // TTS language
    "last_chunk": false,    // Last generated chunk for this `transcription_id`
    "data": "base64 string"
  }
}

current_task

The get_task command response, containing the current task settings.

{
  "message_type": "current_task",
  "data": { /* current task settings, as submitted via set_task */ }
}

error

Validation, authorization, or other errors.

{
  "message_type": "error",
  "data": {
    "code": "VALIDATION_ERROR",
    "desc": "ValidationError(model='SetTaskMessage', errors=[{'loc': ('input_stream', 'content_type')",
    "msg": "value is not a valid enumeration member; permitted: 'audio'\", 'type': 'type_error.enum'",
    "param": null
  }
}
}