Translation management API
Control Palabra's real-time translation pipeline over WebSockets or a WebRTC data channel.
1. Prerequisites
Before you connect, make sure you have the following:
- `webrtc_url` – the URL of the Palabra WebRTC server.
- `ws_url` – the URL of the Palabra WebSocket server.
- `publisher` access token – a JWT used to authorize your connection.
All three values are returned when you create a streaming session.
2. Choose a transport
2.1 Option 1. WebRTC
- Connect with any LiveKit client to `webrtc_url` using your `publisher` token.
- Once the connection is open, you can start sending commands through the default (empty-topic) WebRTC data channel.
2.2 Option 2. WebSockets
Connect to `ws_url`, passing your `publisher` token as a query parameter:

```javascript
// WebSocket control URL
const endpoint = `${ws_url}?token=${publisher}`;
const socket = new WebSocket(endpoint);
```
Once the connection is open, you can start sending commands.
3. Message format (WebRTC & WebSockets)
Every API packet, request and response alike, has the same envelope:

```json
{
  "message_type": "<string>",
  "data": { /* payload */ }
}
```

If `message_type` is `"error"`, the `data` field contains diagnostic information.
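A minimal sketch of wrapping and unwrapping this envelope in plain Node.js (helper names are illustrative, not part of the API):

```javascript
// Build an API packet: every message shares the same two-field envelope.
function buildEnvelope(messageType, payload = {}) {
  return JSON.stringify({ message_type: messageType, data: payload });
}

// Parse an incoming packet; surface "error" packets as exceptions.
function parseEnvelope(raw) {
  const { message_type, data } = JSON.parse(raw);
  if (message_type === 'error') {
    // data carries the diagnostic information
    throw new Error(`Palabra API error: ${JSON.stringify(data)}`);
  }
  return { message_type, data };
}

const packet = buildEnvelope('pause_task');
const parsed = parseEnvelope(packet);
```

The same two functions work for both transports, since WebRTC data-channel and WebSocket messages share the envelope.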
4. Typical workflow
- Create task – send a `set_task` message to start the translation.
- Update task – send another `set_task` message to update the translation settings during an ongoing translation.
- Pause processing – send a `pause_task` message to pause the translation (stops billing). Resume with another `set_task`.
- Finish task – send an `end_task` message (the server will close your connection automatically; the session is invalidated after 1 minute).
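The lifecycle above can be sketched against any object exposing a `send` method (a WebSocket or a data-channel wrapper). The `fakeSocket` here is a stand-in that only records what would be sent:

```javascript
// Drive the task lifecycle: create -> update -> pause -> resume -> finish.
function runLifecycle(socket, settings, updatedSettings) {
  const send = (message_type, data) =>
    socket.send(JSON.stringify({ message_type, data }));

  send('set_task', settings);          // start translation
  send('set_task', updatedSettings);   // live-update settings
  send('pause_task', {});              // pause (stops billing)
  send('set_task', updatedSettings);   // resume after pause
  send('end_task', { force: false });  // finish; server closes the connection
}

// A stand-in socket that records outgoing message types instead of sending.
const sent = [];
const fakeSocket = { send: (raw) => sent.push(JSON.parse(raw).message_type) };
runLifecycle(fakeSocket, { pipeline: {} }, { pipeline: {} });
```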
5. Message Settings reference
- What each field means - see the translation settings breakdown.
- Recommended values - see the best-practice settings.
6. Streaming audio configuration
6.1 Option 1. WebRTC audio I/O configuration
Use the following input/output stream configuration in your `set_task`:

```json
{
  "data": {
    "input_stream": {
      "content_type": "ws",
      "source": {
        "type": "webrtc"
      }
    },
    "output_stream": {
      "content_type": "ws",
      "target": {
        "type": "webrtc"
      }
    }
    // ...
  }
}
```
- Publish your microphone track to the LiveKit Room.
- Subscribe to the translation tracks that Palabra will publish in the same LiveKit Room after you send the `set_task` message.
6.2 Option 2. WebSocket audio I/O configuration
Use this configuration instead:

```json
{
  "data": {
    "input_stream": {
      "content_type": "ws",
      "source": {
        "type": "ws",
        "format": "opus",      // or pcm_s16le, wav
        "sample_rate": 24000,  // 16000 - 24000
        "channels": 1          // 1 or 2
      }
    },
    "output_stream": {
      "content_type": "ws",
      "target": {
        "type": "ws",
        "format": "pcm_s16le"  // or zlib_pcm_s16le
      }
    }
  }
}
```
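A small helper (hypothetical, not part of the API) can validate the declared input parameters against the documented ranges before building the config:

```javascript
const AUDIO_FORMATS = ['opus', 'pcm_s16le', 'wav'];

// Build the WebSocket input_stream config, rejecting values outside the
// documented ranges (sample_rate 16000-24000, channels 1 or 2).
function wsInputStream({ format, sampleRate, channels }) {
  if (!AUDIO_FORMATS.includes(format)) {
    throw new Error(`unsupported format: ${format}`);
  }
  if (sampleRate < 16000 || sampleRate > 24000) {
    throw new Error(`sample_rate out of range: ${sampleRate}`);
  }
  if (channels !== 1 && channels !== 2) {
    throw new Error('channels must be 1 or 2');
  }
  return {
    content_type: 'ws',
    source: { type: 'ws', format, sample_rate: sampleRate, channels },
  };
}

const inputStream = wsInputStream({ format: 'opus', sampleRate: 24000, channels: 1 });
```

Validating client-side avoids a round-trip that would otherwise end in a `VALIDATION_ERROR` response.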
- Send base-64 audio chunks that exactly match the declared format.
- Receive base-64 TTS chunks in `output_audio_data` responses:

```json
{
  "message_type": "output_audio_data",
  "data": {
    "transcription_id": "190983855fe3404e",
    "language": "es",
    "last_chunk": false,
    "data": "<base64-encoded audio>"
  }
}
```
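In Node.js, base-64 encoding of outgoing raw chunks and reassembly of incoming `output_audio_data` payloads might look like this (helper names are illustrative):

```javascript
// Wrap a raw audio chunk as the payload of an input_audio_data message.
function inputAudioMessage(audioBuffer) {
  return {
    message_type: 'input_audio_data',
    data: { data: audioBuffer.toString('base64') },
  };
}

// Accumulate output_audio_data payloads until last_chunk is seen,
// then return the decoded audio as a single Buffer (null until then).
function makeChunkAssembler() {
  const chunks = [];
  return function onOutputAudio({ data }) {
    chunks.push(Buffer.from(data.data, 'base64'));
    return data.last_chunk ? Buffer.concat(chunks) : null;
  };
}

const msg = inputAudioMessage(Buffer.from([0, 1, 2, 3]));
const assemble = makeChunkAssembler();
assemble({ data: { last_chunk: false, data: Buffer.from('ab').toString('base64') } });
const audio = assemble({ data: { last_chunk: true, data: Buffer.from('cd').toString('base64') } });
```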
7. API messages
Request message schema
Response message schema
7.1 Requests (client → server)
| Message type | Short description |
|---|---|
| `set_task` | Create/update translation task |
| `end_task` | Finish translation task |
| `get_task` | Return current task |
| `pause_task` | Pause current task; use `set_task` to continue |
| `tts_task` | Generate TTS from text |
| `input_audio_data` | Input audio data chunk (WebSockets audio transport only) |
set_task
Create a new task or modify the current one.
- Sending it for the first time after creating a session starts the translation.
- Sending it after `pause_task` resumes the translation.
- Sending another `set_task` message during an ongoing translation updates the current translation settings in real time; no need to stop the translation.
```json
{
  "message_type": "set_task",
  "data": {
    "input_stream": { /* Depends on transport; see the audio I/O section above */ },
    "output_stream": { /* Depends on transport; see the audio I/O section above */ },
    "pipeline": {
      "transcription": {
        "source_language": "string",
        "detectable_languages": ["string"],
        "segment_confirmation_silence_threshold": "float",
        "only_confirm_by_silence": "bool",
        "sentence_splitter": {
          "enabled": "bool"
        },
        "verification": {
          "auto_transcription_correction": "bool",
          "transcription_correction_style": "string"
        }
      },
      // Translation and speech generation settings for one or more target languages
      "translations": [
        {
          "target_language": "string",
          "translate_partial_transcriptions": "bool",
          "speech_generation": {
            "voice_cloning": "bool",
            "voice_id": "string",
            "voice_timbre_detection": {
              "enabled": "bool",
              "high_timbre_voices": [],
              "low_timbre_voices": []
            }
          }
        }
        // You can add more targets
      ],
      "translation_queue_configs": {
        "global": {
          "desired_queue_level_ms": "int",
          "max_queue_level_ms": "int",
          "auto_tempo": "bool"
        },
        "es": {
          "desired_queue_level_ms": "int",
          "max_queue_level_ms": "int"
        }
      },
      // Select response types to receive
      "allowed_message_types": [
        "translated_transcription",
        "partial_transcription",
        "partial_translated_transcription",
        "validated_transcription"
      ]
    }
  }
}
```
See the Translation settings breakdown for details on each field, and the Recommended settings for the translation task pipeline.
Need ASR-only mode? Do either:
- Set `output_stream` to `null` (you will still get text translations, but no TTS audio).
- Use an empty `translations` list (no translations and no TTS).
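Both options above amount to a small patch on the `set_task` payload; a sketch (helper names illustrative):

```javascript
// Option 1: null output_stream -> text translations still arrive, no TTS audio.
function asrOnlyViaNullOutput(task) {
  return { ...task, data: { ...task.data, output_stream: null } };
}

// Option 2: empty translations list -> no translations and no TTS.
function asrOnlyViaEmptyTranslations(task) {
  const pipeline = { ...task.data.pipeline, translations: [] };
  return { ...task, data: { ...task.data, pipeline } };
}

const baseTask = {
  message_type: 'set_task',
  data: {
    output_stream: { content_type: 'ws', target: { type: 'webrtc' } },
    pipeline: {
      transcription: { source_language: 'en' },
      translations: [{ target_language: 'es' }],
    },
  },
};

const noTts = asrOnlyViaNullOutput(baseTask);
const noTranslations = asrOnlyViaEmptyTranslations(baseTask);
```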
end_task
Finish the current task. The server closes the connection after receiving `end_task`.

```json
{
  "message_type": "end_task",
  "data": { "force": false } // set true to skip finalization of the last phrase
}
```
pause_task
Pause the current task. No audio data is processed and no billing occurs while the task is paused. Use `set_task` to resume translation.

```json
{ "message_type": "pause_task", "data": {} }
```
get_task
Return the current task.

```json
{ "message_type": "get_task", "data": {} }
```
tts_task
Generate TTS from text. The text will be translated into every `target_language` listed in the task's `translations` section.

```json
{
  "message_type": "tts_task",
  "data": {
    "text": "Hello, how are you?",
    "language": "en" // text language
  }
}
```
input_audio_data
Sends a base-64-encoded input audio chunk when WebSockets is selected as the audio transport.
The audio chunks you push must match the `format` / `sample_rate` / `channels` you declared in your `set_task` command. The optimal chunk length is 320 ms.

```json
{
  "message_type": "input_audio_data",
  "data": {
    "data": "base64 encoded data"
  }
}
```
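For raw PCM (`pcm_s16le`, 2 bytes per sample), the byte size of a 320 ms chunk follows directly from the declared rate and channel count:

```javascript
// Bytes per chunk for pcm_s16le: sample_rate * channels * 2 bytes * duration.
function chunkBytes(sampleRate, channels, chunkMs = 320) {
  return Math.round(sampleRate * channels * (chunkMs / 1000) * 2);
}

const bytesPerChunk = chunkBytes(24000, 1); // 24000 * 1 * 0.32 * 2 = 15360
```

So at 24 kHz mono you would slice your PCM stream into 15360-byte chunks before base-64 encoding.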
7.2 Responses (server → client)
| Message type | Short description |
|---|---|
| `partial_transcription` | Unconfirmed ASR segment |
| `partial_translated_transcription` | Unconfirmed translation segment |
| `validated_transcription` | Final ASR segment |
| `translated_transcription` | Final translation |
| `output_audio_data` | Chunk of generated TTS audio (WebSockets audio transport only) |
| `current_task` | `get_task` command response |
| `error` | Validation or runtime error |
To receive `partial_transcription`, `validated_transcription`, and `translated_transcription` messages, you must include these message types in the `allowed_message_types` field of your `set_task` command.
To receive `partial_translated_transcription` messages, you must include it in the `allowed_message_types` field AND set `translate_partial_transcriptions` to `true` in your `set_task` command.
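The two rules above can be encoded as a small client-side guard (a hypothetical helper, not part of the API) that patches a `set_task` payload so the subscription is consistent:

```javascript
// Add the wanted message types to allowed_message_types, and if
// partial_translated_transcription is wanted, also flip
// translate_partial_transcriptions on every translation target.
function subscribe(taskData, wanted) {
  const allowed = [
    ...new Set([...(taskData.pipeline.allowed_message_types || []), ...wanted]),
  ];
  const translations = (taskData.pipeline.translations || []).map((t) =>
    wanted.includes('partial_translated_transcription')
      ? { ...t, translate_partial_transcriptions: true }
      : t
  );
  return {
    ...taskData,
    pipeline: { ...taskData.pipeline, allowed_message_types: allowed, translations },
  };
}

const patched = subscribe(
  { pipeline: { translations: [{ target_language: 'es' }] } },
  ['partial_transcription', 'partial_translated_transcription']
);
```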
partial_transcription
An unconfirmed (in-progress) segment transcription:

```json
{
  "message_type": "partial_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "en",
      "text": "One, two"
    }
  }
}
```
partial_translated_transcription
An unconfirmed (in-progress) segment translation:

```json
{
  "message_type": "partial_translated_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "es",
      "text": "Um, dois,"
    }
  }
}
```
validated_transcription
A completed (final) segment transcription:

```json
{
  "message_type": "validated_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "en",
      "text": "One, two, three, four, five."
    }
  }
}
```
translated_transcription
A completed (final) segment translation:

```json
{
  "message_type": "translated_transcription",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "es",
      "text": "Um, dois, três, quatro, cinco."
    }
  }
}
```
output_audio_data
A TTS audio chunk (sent when you use WebSockets as the audio transport):

```json
{
  "message_type": "output_audio_data",
  "data": {
    "transcription": {
      "transcription_id": "190983855fe3404e",
      "language": "es",       // TTS language
      "last_chunk": false,    // true on the last generated chunk for this transcription_id
      "data": "base64 string"
    }
  }
}
```
current_task
The `get_task` command response, containing the current task.
error
Validation, authorization, or other kinds of errors.

```json
{
  "message_type": "error",
  "data": {
    "code": "VALIDATION_ERROR",
    "desc": "ValidationError(model='SetTaskMessage', errors=[{'loc': ('input_stream', 'content_type')",
    "msg": "value is not a valid enumeration member; permitted: 'audio'\", 'type': 'type_error.enum'",
    "param": null
  }
}
```