
Create Voice Cloning Task

Start a voice cloning task. You can upload audio files directly or provide URLs to existing audio files.

Request Information

  • Method: POST
  • Endpoint: /api/instant/create-task
  • Content Type: multipart/form-data or application/json

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| audio | File (binary) | Yes* | Audio file for voice cloning. Supported formats include WAV, MP3, and M4A. Only supported with multipart/form-data. You must provide either audio or audio_url. |
| audio_url | string | Yes* | Publicly accessible audio file URL (WAV, MP3, M4A). Supported with both multipart/form-data and application/json. You must provide either audio or audio_url. |
| text | string | Yes | The text you want to synthesize with the cloned voice. |
| api_key | string | Yes | Your unique API key for authentication and access. This key is used to verify your request and link it to your user account. |
| model | string | No | Voice cloning model version. Options: v1, v2, v-mul. Default is v2. Different models vary in audio quality, processing speed, and multilingual support. |
| speed_ratio | float | No | Speed ratio, range 0.5-2.0, default is 1.0. |
| pitch_ratio | float | No | Pitch offset, range -10 to 10 semitones, default is 0. |
| volume_ratio | float | No | Volume ratio, range 0.1-2.0, default is 1.0. |
| emotion_control | object | No | Emotion control parameters (v2 model only). Supports multiple emotion control modes; see below for details. |

*Note:

  • At least one of audio or audio_url must be provided
  • When using application/json format, only audio_url parameter is supported
  • When using multipart/form-data format, both parameters are supported
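
The rules in this note can be checked on the client before sending anything. The following is a minimal sketch, not part of the API: the function name pick_content_type is hypothetical, and the document does not say what happens when both audio and audio_url are sent, so the sketch simply falls back to multipart/form-data in that case.

```python
from typing import Optional


def pick_content_type(audio_path: Optional[str], audio_url: Optional[str]) -> str:
    """Return the content type a request needs, following the note above."""
    if not audio_path and not audio_url:
        # Mirrors the documented 400 error for a missing audio source.
        raise ValueError("provide at least one of audio or audio_url")
    if audio_path:
        # A file upload is only possible with multipart/form-data.
        # Assumption: if audio_url is also set, multipart is still used.
        return "multipart/form-data"
    # audio_url alone works with either format; JSON is the simpler choice.
    return "application/json"
```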

Response

Success Response

{ "task_id": "1406bf34-735c-4b21-98ac-a135b2afb1c8", "status": "pending" }

Error Response

  • 400 Bad Request: Missing required parameters (e.g., api_key, or neither audio nor audio_url provided)
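
A client might handle the two documented outcomes along these lines. This is a sketch under assumptions: the helper name is hypothetical, the success status code is not stated in this document (any 2xx is accepted here), and the shape of the error body is not documented, so it is passed through as raw text.

```python
import json
from typing import Tuple


def parse_create_task_response(status_code: int, body: str) -> Tuple[str, str]:
    """Return (task_id, status) on success, or raise on an error response."""
    if status_code == 400:
        # Documented cause: a required parameter is missing.
        raise ValueError(f"400 Bad Request: {body}")
    if not 200 <= status_code < 300:
        # Other error codes are not documented; treat them as unexpected.
        raise RuntimeError(f"unexpected HTTP status {status_code}: {body}")
    data = json.loads(body)
    return data["task_id"], data["status"]
```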

Example Requests

Using Audio File

curl -X POST https://aivoiceclonefree.com/api/instant/create-task \
  -F "audio=@sample.mp3" \
  -F "text=This is a long text suitable for async interface processing..." \
  -F "api_key=your_api_key_here"

Using Audio URL

curl -X POST https://aivoiceclonefree.com/api/instant/create-task \
  -F "audio_url=https://example.com/sample.mp3" \
  -F "text=This is a long text suitable for async interface processing..." \
  -F "api_key=your_api_key_here"

Using Specified Model

curl -X POST https://aivoiceclonefree.com/api/instant/create-task \
  -F "audio=@sample.mp3" \
  -F "text=This is a long text suitable for async interface processing..." \
  -F "api_key=your_api_key_here" \
  -F "model=v2"

Using Audio Processing Parameters

curl -X POST https://aivoiceclonefree.com/api/instant/create-task \
  -F "audio=@sample.mp3" \
  -F "text=This is a long text suitable for async interface processing..." \
  -F "api_key=your_api_key_here" \
  -F "model=v2" \
  -F "speed_ratio=1.2" \
  -F "pitch_ratio=2" \
  -F "volume_ratio=1.5"
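
The documented ranges for the three audio processing parameters can be verified before a request is sent. The sketch below is illustrative only: check_audio_params is a hypothetical helper, and it assumes the range endpoints are inclusive, which the parameter table does not state explicitly.

```python
# Documented ranges: speed 0.5-2.0, pitch -10 to 10 semitones, volume 0.1-2.0.
RANGES = {
    "speed_ratio": (0.5, 2.0),
    "pitch_ratio": (-10.0, 10.0),
    "volume_ratio": (0.1, 2.0),
}


def check_audio_params(**params: float) -> dict:
    """Return params unchanged if every value is inside its documented range."""
    for name, value in params.items():
        if name not in RANGES:
            raise KeyError(f"unknown audio parameter: {name}")
        low, high = RANGES[name]
        # Assumption: both endpoints are allowed values.
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")
    return params
```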

Using JSON Format (with audio_url)

curl -X POST https://aivoiceclonefree.com/api/instant/create-task \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "https://example.com/sample.mp3",
    "text": "This is a long text suitable for async interface processing...",
    "api_key": "your_api_key_here",
    "model": "v2",
    "speed_ratio": 1.2,
    "pitch_ratio": 2,
    "volume_ratio": 1.5
  }'
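
For callers who prefer a script to curl, the same JSON request can be assembled with Python's standard library. This is a rough equivalent rather than an official client: the function names are hypothetical, the 30-second timeout is an arbitrary choice, and the response is assumed to be the JSON object shown under Success Response.

```python
import json
import urllib.request

API_URL = "https://aivoiceclonefree.com/api/instant/create-task"


def build_json_request(audio_url: str, text: str, api_key: str, **options) -> urllib.request.Request:
    """Build (but do not send) a JSON create-task request."""
    # options carries the optional fields, e.g. model="v2", speed_ratio=1.2.
    payload = {"audio_url": audio_url, "text": text, "api_key": api_key, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.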

Model Selection Guide

The API supports three voice cloning models that you can choose from based on your needs:

| Model | Description | Use Cases |
|---|---|---|
| v2 | Second generation model (default), balances audio quality and processing speed | Recommended for most scenarios, provides high-quality voice cloning results |
| v1 | First generation model, faster processing speed | Suitable for scenarios with high processing speed requirements |
| v-mul | Multilingual model, supports cross-language voice cloning | Suitable for applications requiring multilingual support |

Note: If the model parameter is not specified, the system will use the v2 model by default.
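
The guide can be condensed into a small selection helper. The sketch below is only one reading of the table: choose_model and its flags are hypothetical, and the priority order (emotion control first, then multilingual, then speed) is an assumption rather than documented behavior.

```python
def choose_model(multilingual: bool = False,
                 prefer_speed: bool = False,
                 emotion_control: bool = False) -> str:
    """Pick a value for the model parameter from coarse requirements."""
    if emotion_control:
        # emotion_control is documented as available on the v2 model only.
        if multilingual:
            raise ValueError("emotion_control requires v2, which is not the multilingual model")
        return "v2"
    if multilingual:
        return "v-mul"
    if prefer_speed:
        return "v1"
    # v2 is the documented default and the recommended general choice.
    return "v2"
```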

Emotion Control Parameters (emotion_control)

Emotion control is available only on the v2 model and is configured through the emotion_control object. The object can be passed directly in JSON format requests.

Emotion Control Modes

| Mode | mode Value | Description | Additional Parameters |
|---|---|---|---|
| Same as Reference | same_as_reference | Use the emotion from reference audio (default) | None |
| Reference Audio | reference_audio | Use specified reference audio to control emotion | reference_audio_url |
| Vector Control | vector | Use 8-dimensional emotion vector to control emotion | vector object |
| Text Description | text | Use text description to control emotion | text string |
| Random Emotion | random | Randomly generate emotion | None |
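
Each mode pairs a mode value with at most one additional field, so the emotion_control object can be built and checked in one place. This is a sketch with a hypothetical helper name. Rejecting unexpected fields is a client-side choice made here; the document does not say how the server treats them.

```python
# Additional field required by each mode, per the table above (None = none).
REQUIRED_FIELD = {
    "same_as_reference": None,
    "reference_audio": "reference_audio_url",
    "vector": "vector",
    "text": "text",
    "random": None,
}


def build_emotion_control(mode: str = "same_as_reference", **fields) -> dict:
    """Return an emotion_control object for the given mode."""
    if mode not in REQUIRED_FIELD:
        raise ValueError(f"unknown mode: {mode}")
    required = REQUIRED_FIELD[mode]
    allowed = {required} if required else set()
    unexpected = set(fields) - allowed
    if unexpected:
        raise ValueError(f"mode {mode!r} does not take {sorted(unexpected)}")
    if required and required not in fields:
        raise ValueError(f"mode {mode!r} requires {required}")
    control = {"mode": mode}
    if required:
        control[required] = fields[required]
    return control
```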

8-Dimensional Emotion Vector (vector mode)

When using vector mode, you can precisely control the intensity of 8 emotions. Each value ranges from 0.0 to 1.0:

{
  "joy": 0.5,         // Joy (0.0-1.0)
  "anger": 0.0,       // Anger (0.0-1.0)
  "sorrow": 0.0,      // Sorrow (0.0-1.0)
  "fear": 0.0,        // Fear (0.0-1.0)
  "excitement": 0.3,  // Excitement (0.0-1.0)
  "depression": 0.0,  // Depression (0.0-1.0)
  "surprise": 0.2,    // Surprise (0.0-1.0)
  "calm": 0.0         // Calm (0.0-1.0)
}
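
A vector like this can be assembled from just the emotions of interest. The helper below is a hypothetical sketch: it validates the documented 0.0 to 1.0 range and writes every unspecified emotion as 0.0 explicitly, so the result does not rely on whatever server-side default applies to omitted keys (which this document does not specify).

```python
EMOTIONS = ("joy", "anger", "sorrow", "fear",
            "excitement", "depression", "surprise", "calm")


def make_emotion_vector(**intensities: float) -> dict:
    """Return a full 8-key emotion vector; unspecified emotions are set to 0.0."""
    unknown = set(intensities) - set(EMOTIONS)
    if unknown:
        raise ValueError(f"unknown emotions: {sorted(unknown)}")
    for name, value in intensities.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name}={value} is outside 0.0..1.0")
    # Keep the key order used in the documentation.
    return {name: float(intensities.get(name, 0.0)) for name in EMOTIONS}
```

For example, make_emotion_vector(joy=0.8, excitement=0.6) yields all eight keys, with the six unspecified emotions at 0.0.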

Emotion Control Examples

Using Vector Control (Happy + Excited)

curl -X POST https://aivoiceclonefree.com/api/instant/create-task \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "https://example.com/sample.mp3",
    "text": "What a beautiful day!",
    "api_key": "your_api_key_here",
    "model": "v2",
    "emotion_control": {
      "mode": "vector",
      "vector": {
        "joy": 0.8,
        "excitement": 0.6,
        "surprise": 0.3,
        "calm": 0.2
      }
    }
  }'

Using Text Description Control

curl -X POST https://aivoiceclonefree.com/api/instant/create-task \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "https://example.com/sample.mp3",
    "text": "This is absolutely amazing!",
    "api_key": "your_api_key_here",
    "model": "v2",
    "emotion_control": {
      "mode": "text",
      "text": "excited and happy"
    }
  }'

Using Reference Audio Control

curl -X POST https://aivoiceclonefree.com/api/instant/create-task \
  -H "Content-Type: application/json" \
  -d '{
    "audio_url": "https://example.com/sample.mp3",
    "text": "I feel very happy",
    "api_key": "your_api_key_here",
    "model": "v2",
    "emotion_control": {
      "mode": "reference_audio",
      "reference_audio_url": "https://example.com/emotion-ref.wav"
    }
  }'

Task Status Description

After creating a task, you will receive a task_id and an initial status of pending. Possible status values:

| Status | Description |
|---|---|
| pending | Task submitted, waiting for processing |
| processing | Task is being processed |
| completed | Task completed |
| failed | Task processing failed |
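
In client code, the four status values map naturally onto an enumeration. The sketch below uses hypothetical names. Treating completed and failed as the only terminal states is an inference from the descriptions above, not something the document states outright.

```python
from enum import Enum


class TaskStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"

    @property
    def is_terminal(self) -> bool:
        """True once the task can no longer change state (assumed)."""
        return self in (TaskStatus.COMPLETED, TaskStatus.FAILED)
```

Constructing TaskStatus from a response value, as in TaskStatus("pending"), raises ValueError for any string outside the table, which surfaces unexpected statuses early.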

Next Steps

After successful task creation:

  1. Save the returned task_id
  2. Use Task Status Query to monitor progress
  3. Download audio via Get Task Result after completion
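
The monitoring step usually amounts to a polling loop. The following is a sketch, not a documented client: the Task Status Query endpoint is described elsewhere, so the status lookup is passed in as a caller-supplied function (fetch_status is hypothetical), and the 2-second interval and 300-second timeout are arbitrary defaults.

```python
import time
from typing import Callable


def wait_for_task(task_id: str,
                  fetch_status: Callable[[str], str],
                  interval: float = 2.0,
                  timeout: float = 300.0,
                  sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll until the task reaches completed or failed, then return that status."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status(task_id)
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status} after {timeout}s")
        # A fixed pause between polls helps stay within call frequency limits.
        sleep(interval)
```

Once the function returns completed, the audio can be downloaded via Get Task Result.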

Best Practices

  • Text Length: Keep the text for a single task to 10,000 characters or fewer
  • Audio Quality: Use high-quality audio samples for better cloning results
  • Request Format:
    • Using multipart/form-data: You can upload an audio file directly (audio parameter) or pass an audio URL (audio_url parameter)
    • Using application/json: You can only pass an audio URL (audio_url parameter)
  • API Limits: The API limits call frequency, so avoid sending requests too often
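
Text longer than the recommended 10,000 characters can be split into several tasks. The helper below is a hypothetical sketch: it prefers to break at the end of a sentence and falls back to a hard cut at the limit when no sentence boundary is found. The set of sentence endings it recognizes is a simplification suited to English text.

```python
from typing import List


def split_text(text: str, limit: int = 10_000) -> List[str]:
    """Split text into chunks of at most `limit` characters."""
    chunks: List[str] = []
    remaining = text.strip()
    while len(remaining) > limit:
        # Last sentence end that fits inside the limit (-1 if none).
        cut = max(remaining.rfind(mark, 0, limit) for mark in (". ", "! ", "? "))
        # Keep the punctuation mark with its sentence; otherwise cut hard.
        cut = cut + 1 if cut > 0 else limit
        chunk = remaining[:cut].strip()
        if chunk:
            chunks.append(chunk)
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk would then be submitted as its own task, spaced out to respect the call frequency limits.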