
Clone Voice with API

This guide walks you through the process of cloning a voice using Palabra's API.
Voice cloning via API allows you to programmatically create a voice that replicates a specific speaker by submitting a short audio sample and related metadata.

Step 1: Get API credentials

  1. Log in to your Palabra account
  2. Go to the Palabra API section
  3. Create a new API key or use an existing one
  4. Copy your Client ID and Client Secret — you'll need them to authenticate requests
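
One way to keep the credentials out of source code is to read them from environment variables and build the request headers once. The sketch below assumes a Node.js runtime and two hypothetical variable names, PALABRA_CLIENT_ID and PALABRA_CLIENT_SECRET. The ClientId and ClientSecret header names are the ones used in the request examples in this guide.

```javascript
// Build the authentication headers from environment variables.
// PALABRA_CLIENT_ID and PALABRA_CLIENT_SECRET are hypothetical names chosen for this sketch.
function buildAuthHeaders(env = process.env) {
  const clientId = env.PALABRA_CLIENT_ID;
  const clientSecret = env.PALABRA_CLIENT_SECRET;
  if (!clientId || !clientSecret) {
    throw new Error('Missing PALABRA_CLIENT_ID or PALABRA_CLIENT_SECRET');
  }
  return { ClientId: clientId, ClientSecret: clientSecret };
}
```

The returned object can be spread into the headers of any request to the Palabra API.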


Step 2: Prepare your audio sample

To ensure high-quality voice cloning, please follow the guidelines below when uploading your sample:

  • Accepted formats: MP3, WAV, FLAC, WEBM, MP4, MPEG, or MPG
  • Maximum file size: 10 MB
  • Minimum duration: 30 seconds
  • Audio quality: No background noise
  • Speaker requirement: Only one speaker per sample
  • Input types: Audio or video files are accepted
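
The measurable guidelines above (format, size, duration) can be checked before any request is sent. The following is a minimal sketch, not part of the Palabra API: the checkSample function and the shape of its argument are hypothetical, and it assumes 1 MB means 1024 × 1024 bytes. Background noise and the number of speakers cannot be verified this way.

```javascript
// Limits taken from the guidelines above.
const ACCEPTED_EXTENSIONS = ['mp3', 'wav', 'flac', 'webm', 'mp4', 'mpeg', 'mpg'];
const MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024; // assumption: 1 MB = 1024 * 1024 bytes
const MIN_DURATION_SECONDS = 30;

// Returns a list of problems; an empty list means the basic checks passed.
// `sample` is a hypothetical shape: { filename, sizeBytes, durationSeconds }.
function checkSample(sample) {
  const problems = [];
  const dot = sample.filename.lastIndexOf('.');
  const extension = dot === -1 ? '' : sample.filename.slice(dot + 1).toLowerCase();
  if (!ACCEPTED_EXTENSIONS.includes(extension)) {
    problems.push(`unsupported format: ${extension || '(none)'}`);
  }
  if (sample.sizeBytes > MAX_FILE_SIZE_BYTES) {
    problems.push('file is larger than 10 MB');
  }
  if (sample.durationSeconds < MIN_DURATION_SECONDS) {
    problems.push('sample is shorter than 30 seconds');
  }
  return problems;
}
```

For example, checkSample({ filename: 'voice.mp3', sizeBytes: 2000000, durationSeconds: 45 }) returns an empty array.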

Step 3: Create the voice cloning request

Voice cloning through the API is performed in two steps:

Step 1: Submit voice cloning metadata

First, send a POST request to create a voice cloning task with the metadata of your audio sample.
At this stage, you do not upload the audio file itself — only its metadata is submitted.

Endpoint

https://api.palabra.ai/saas/voice/clone

Sample payload

{
  "name": "My voice",
  "samples": [
    {
      "filename": "20250611_1453_Recording.mp3",
      "mime_type": "audio/mpeg",
      "display_name": "My voice",
      "description": "Description of my voice",
      "denoise": false,
      "lang_code": "en",
      "speech_normalization": true
    }
  ],
  "description": "Description of my voice",
  "labels": {
    "gender": null,
    "age_group": null,
    "mood": null
  }
}

Field descriptions

name (required)

A user-defined name for the cloned voice. This will be used to identify the voice in your Palabra account.

samples (required)

An array of one or more audio samples with metadata for each file.

  • filename (required)
    The original filename of the uploaded sample.

  • mime_type (required)
    MIME type of the file (e.g., audio/mpeg, audio/wav).

  • display_name (optional)
    Human-readable name to display in the UI.

  • description (optional)
    Additional information about the sample.

  • speech_normalization (optional)
    Whether to apply automatic speech normalization (true or false). Default is true.

  • denoise (optional)
    Whether to apply automatic denoising (true or false). Default is false.

  • lang_code (required)
    Language code of the speaker (e.g., en, uk). Used to optimize voice modeling.

description (optional)

A description of the cloned voice for internal reference.

labels (optional)

Optional metadata describing the speaker:

  • gender – One of male, female, or null.
  • age_group – One of kid, adult, senior, or null.
  • mood – One of neutral, happy, angry, sad, or null.

Note: The name and samples fields are required, and each sample must include filename, mime_type, and lang_code. All other fields are optional but recommended for better accuracy and organization.
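
The field rules above can be checked on the client before the request is sent. The sketch below is illustrative only: validateClonePayload is a hypothetical helper, and the server may apply additional rules that are not listed in this guide.

```javascript
// Allowed label values, as listed in the field descriptions above.
const LABEL_VALUES = {
  gender: ['male', 'female', null],
  age_group: ['kid', 'adult', 'senior', null],
  mood: ['neutral', 'happy', 'angry', 'sad', null],
};

// Returns a list of validation errors; an empty list means the payload looks well-formed.
function validateClonePayload(payload) {
  const errors = [];
  if (!payload.name) errors.push('name is required');
  if (!Array.isArray(payload.samples) || payload.samples.length === 0) {
    errors.push('samples must contain at least one entry');
  } else {
    payload.samples.forEach((sample, i) => {
      for (const field of ['filename', 'mime_type', 'lang_code']) {
        if (!sample[field]) errors.push(`samples[${i}].${field} is required`);
      }
    });
  }
  // Only the three documented labels are checked; other keys are ignored.
  for (const [label, value] of Object.entries(payload.labels ?? {})) {
    const allowed = LABEL_VALUES[label];
    if (allowed && !allowed.includes(value)) {
      errors.push(`invalid value for labels.${label}: ${value}`);
    }
  }
  return errors;
}
```

Running it on the sample payload above yields an empty list.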

Example: Voice cloning request

// Metadata describing the voice and its audio sample (the file itself is uploaded later).
const payload = {
  name: "My voice",
  samples: [
    {
      filename: "20250611_1453_Recording.mp3",
      mime_type: "audio/mpeg",
      display_name: "My voice",
      description: "Description of my voice",
      denoise: false,
      lang_code: "en"
    }
  ],
  description: "Description of my voice",
  labels: {
    gender: null,
    age_group: null,
    mood: null
  }
};

const response = await fetch('https://api.palabra.ai/saas/voice/clone', {
  method: 'POST',
  headers: {
    // The body is JSON; without this header fetch labels a string body as text/plain.
    'Content-Type': 'application/json',
    'ClientId': '<YOUR_CLIENT_ID>',
    'ClientSecret': '<YOUR_CLIENT_SECRET>'
  },
  body: JSON.stringify(payload)
});

if (!response.ok) {
  const errorText = await response.text().catch(() => response.statusText);
  throw new Error(`Failed to clone voice: ${response.status} ${errorText}`);
}

// The parsed response holds the voice_id and, per sample, the upload url and form_data.
const voice = await response.json();

Response

{
"utc_created_at": "2025-06-19T10:52:53.893244",
"voice_id": "10545719-5dfb-4164-9b39-cc70ed2ff97d",
"user_id": "02117a4f-a847-4264-9807-704d279bbf3a",
"name": "My voice",
"voice_type": "instantly_cloned",
"processing_status": "created",
"description": "My voice",
"labels": {
"gender": null,
"age_group": null,
"mood": null
},
"lang_code": "en",
"samples": [
{
"item_id": "0",
"blob_id": "7e8344fc-4408-4ef7-942b-45d641b2877e",
"url": "https://palabra-prod-web-cdn.s3.amazonaws.com/",
"form_data": {
"acl": "private",
"bucket": "palabra-prod-web-cdn",
"key": "blob/author/instant_voice_clone_upload_input_sample/02117a4f-a847-4264-9807-704d279bbf3a/7e8344fc-4408-4ef7-942b-45d641b2877e.mp3",
"x-amz-meta-blob-id": "7e8344fc-4408-4ef7-942b-45d641b2877e",
"x-amz-meta-filename": "20250611_1453_Recording.mp3",
"Content-Type": "audio/mpeg",
"x-amz-meta-user-id": "02117a4f-a847-4264-9807-704d279bbf3a",
"x-amz-meta-intent": "instant_voice_clone_upload_input_sample",
"x-amz-meta-voice-id": "10545719-5dfb-4164-9b39-cc70ed2ff97d",
"x-amz-meta-upload-id": "10545719-5dfb-4164-9b39-cc70ed2ff97d",
"x-amz-algorithm": "AWS4-HMAC-SHA256",
"x-amz-credential": "AKIAR3HUOH7XJLBFCRWH/20250619/eu-central-1/s3/aws4_request",
"x-amz-date": "20250619T105253Z",
"policy": "eyJleHBpcmF0aW9uIjogIjIwMjUtMDYtMTlUMTE6MDc6NTNaIiwgImNvbmRpdGlvbnMiOiBbeyJhY2wiOiAicHJpdmF0ZSJ9LCB7ImJ1Y2tldCI6ICJwYWxhYnJhLXByb2Qtd2ViLWNkbiJ9LCB7ImtleSI6ICJibG9iL2F1dGhvci9pbnN0YW50X3ZvaWNlX2Nsb25lX3VwbG9hZF9pbnB1dF9zYW1wbGUvMDIxMTdhNGYtYTg0Ny00MjY0LTk4MDctNzA0ZDI3OWJiZjNhLzdlODM0NGZjLTQ0MDgtNGVmNy05NDJiLTQ1ZDY0MWIyODc3ZS5tcDMifSwgeyJ4LWFtei1tZXRhLWJsb2ItaWQiOiAiN2U4MzQ0ZmMtNDQwOC00ZWY3LTk0MmItNDVkNjQxYjI4NzdlIn0sIHsieC1hbXotbWV0YS1maWxlbmFtZSI6ICIyMDI1MDYxMV8xNDUzX1JlY29yZGluZy5tcDMifSwgeyJDb250ZW50LVR5cGUiOiAiYXVkaW8vbXBlZyJ9LCB7IngtYW16LW1ldGEtdXNlci1pZCI6ICIwMjExN2E0Zi1hODQ3LTQyNjQtOTgwNy03MDRkMjc5YmJmM2EifSwgeyJ4LWFtei1tZXRhLWludGVudCI6ICJpbnN0YW50X3ZvaWNlX2Nsb25lX3VwbG9hZF9pbnB1dF9zYW1wbGUifSwgeyJ4LWFtei1tZXRhLXZvaWNlLWlkIjogIjEwNTQ1NzE5LTVkZmItNDE2NC05YjM5LWNjNzBlZDJmZjk3ZCJ9LCB7IngtYW16LW1ldGEtdXBsb2FkLWlkIjogIjEwNTQ1NzE5LTVkZmItNDE2NC05YjM5LWNjNzBlZDJmZjk3ZCJ9LCBbImNvbnRlbnQtbGVuZ3RoLXJhbmdlIiwgMTA0ODUsIDMzNTU0NDMyXSwgeyJidWNrZXQiOiAicGFsYWJyYS1wcm9kLXdlYi1jZG4ifSwgeyJrZXkiOiAiYmxvYi9hdXRob3IvaW5zdGFudF92b2ljZV9jbG9uZV91cGxvYWRfaW5wdXRfc2FtcGxlLzAyMTE3YTRmLWE4NDctNDI2NC05ODA3LTcwNGQyNzliYmYzYS83ZTgzNDRmYy00NDA4LTRlZjctOTQyYi00NWQ2NDFiMjg3N2UubXAzIn0sIHsieC1hbXotYWxnb3JpdGhtIjogIkFXUzQtSE1BQy1TSEEyNTYifSwgeyJ4LWFtei1jcmVkZW50aWFsIjogIkFLSUFSM0hVT0g3WEpMQkZDUldILzIwMjUwNjE5L2V1LWNlbnRyYWwtMS9zMy9hd3M0X3JlcXVlc3QifSwgeyJ4LWFtei1kYXRlIjogIjIwMjUwNjE5VDEwNTI1M1oifV19",
"x-amz-signature": "b8e3c8607d7b9c66ce92da2f8a0a4b2dbd3578fdb87f671569f5208b343c7a22"
}
}
]
}
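
The form_data object has the shape of an Amazon S3 presigned POST: its policy field is a base64-encoded JSON document whose expiration attribute limits how long the upload is accepted. In the sample response above, the policy decodes to an expiration of 2025-06-19T11:07:53Z, about 15 minutes after the task was created. This guide does not state whether that window is fixed, so the sketch below reads it from the response instead of hard-coding it. The function names are hypothetical, and Buffer assumes a Node.js runtime.

```javascript
// Decode the base64-encoded POST policy and return its expiration as a Date.
// Assumes Node.js (Buffer); in a browser, atob() could be used instead.
function getUploadDeadline(sample) {
  const policyJson = Buffer.from(sample.form_data.policy, 'base64').toString('utf8');
  const policy = JSON.parse(policyJson);
  return new Date(policy.expiration);
}

// Hypothetical helper: true if the upload window is still open at `now`.
function canStillUpload(sample, now = new Date()) {
  return now < getUploadDeadline(sample);
}
```

If canStillUpload returns false, the likely remedy is to create a new voice cloning task and use the fresh form_data, although this guide does not describe that case.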

Step 2: Upload the audio file

For each entry in the samples array returned in Step 1, use its url and form_data fields to upload the corresponding audio file via POST.

Example: Upload request

async function uploadFile(sample, file) {
  const formData = new FormData();

  // Copy every presigned field from the Step 1 response into the form.
  for (const [key, value] of Object.entries(sample.form_data)) {
    formData.append(key, value);
  }

  // Append the file last, after all presigned fields.
  formData.append('file', file, file.name);

  const response = await fetch(sample.url, {
    method: 'POST',
    body: formData,
    headers: {
      'ClientId': '<YOUR_CLIENT_ID>',
      'ClientSecret': '<YOUR_CLIENT_SECRET>'
    },
  });

  if (!response.ok) {
    // Fall back to the status text if the error body cannot be read.
    let errorText;
    try {
      errorText = await response.text();
    } catch {
      errorText = response.statusText;
    }
    throw new Error(`Failed to upload file: ${response.status} ${errorText}`);
  }
}
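
When a request contains several samples, each returned entry has to be paired with the right local file before calling uploadFile. The sketch below matches on form_data['x-amz-meta-filename'], which echoes the submitted filename in the sample response above. Treat that matching rule as an assumption drawn from the sample, not a documented guarantee. pairSamplesWithFiles is a hypothetical helper.

```javascript
// Pair each sample from the Step 1 response with the local file it describes.
// Matching on form_data['x-amz-meta-filename'] is an assumption based on the sample response.
function pairSamplesWithFiles(samples, files) {
  const filesByName = new Map(files.map((file) => [file.name, file]));
  return samples.map((sample) => {
    const filename = sample.form_data['x-amz-meta-filename'];
    const file = filesByName.get(filename);
    if (!file) {
      throw new Error(`No local file for sample ${sample.item_id}: ${filename}`);
    }
    return { sample, file };
  });
}
```

Each resulting pair can then be passed to uploadFile(sample, file), one at a time or in parallel.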

Once the file is successfully uploaded, the system will automatically begin processing the sample. You can track the status of the voice cloning task via the https://api.palabra.ai/saas/voice/m/${id} endpoint.
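
A simple way to track the task is to poll the status endpoint at a fixed interval. The sketch below rests on several assumptions that this guide does not confirm: that ${id} is the voice_id from the Step 1 response, that the endpoint accepts a GET request with the same ClientId and ClientSecret headers, and that its JSON response includes a processing_status field like the creation response does. Because the final status values are not listed here, the caller supplies the isFinished predicate. The function name and options are hypothetical.

```javascript
// Poll the status endpoint until isFinished(status) returns true or attempts run out.
// Assumptions: GET with the same auth headers, ${id} is the voice_id, and the response
// JSON contains a processing_status field.
async function waitForVoice(voiceId, headers, isFinished, options = {}) {
  const { fetchImpl = fetch, intervalMs = 5000, maxAttempts = 60 } = options;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await fetchImpl(`https://api.palabra.ai/saas/voice/m/${voiceId}`, { headers });
    if (!response.ok) {
      throw new Error(`Status check failed: ${response.status}`);
    }
    const voice = await response.json();
    if (isFinished(voice.processing_status)) return voice;
    // Wait before the next check, except after the final attempt.
    if (attempt < maxAttempts) {
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Voice ${voiceId} was still processing after ${maxAttempts} checks`);
}
```

The fetchImpl option exists so the function can be exercised with a stub in tests; in normal use the global fetch is picked up by default.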