
Asynchronous Voice Cloning

The asynchronous voice cloning API is designed for processing longer text content that requires more time to generate. This approach is ideal for:

  • Long-form content: Articles, books, or extensive documentation
  • Large-scale processing: Multiple audio files or batch operations
  • Background processing: Tasks that can run while users continue other activities

How It Works

The asynchronous API follows a three-step process:

  1. Create Task: Submit your audio sample and text for processing
  2. Monitor Progress: Check the status of your task periodically
  3. Download Result: Retrieve the generated audio when complete
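
The three steps above can be sketched as a small polling loop. This is only an illustration under stated assumptions: the page does not give endpoint URLs, request formats, or status values, so the three callables (`create_task`, `get_status`, `download_result`) and the status strings `"completed"` and `"failed"` are hypothetical placeholders for whatever the real endpoints return.

```python
import time
from typing import Callable


def run_clone_task(
    create_task: Callable[[bytes, str], str],   # hypothetical: returns a task id
    get_status: Callable[[str], str],           # hypothetical: returns a status string
    download_result: Callable[[str], bytes],    # hypothetical: returns audio bytes
    sample: bytes,
    text: str,
    poll_seconds: float = 30.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Create a task, poll until it finishes, then return the generated audio."""
    # Step 1: submit the audio sample and text.
    task_id = create_task(sample, text)

    waited = 0.0
    while True:
        # Step 2: check the task status periodically.
        status = get_status(task_id)
        if status == "completed":       # assumed status value
            # Step 3: retrieve the generated audio.
            return download_result(task_id)
        if status == "failed":          # assumed status value
            raise RuntimeError(f"task {task_id} failed")
        if waited >= timeout_seconds:
            raise TimeoutError(f"task {task_id} not finished after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Passing the three operations in as callables keeps the loop independent of any particular HTTP client, and the injectable `sleep` makes it possible to exercise the loop without actually waiting.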

API Endpoints

Endpoint      Purpose                     Description
Create Task   Submit processing request   Upload audio and text to start voice cloning
Task Status   Monitor progress            Check current status and estimated completion
Task Result   Download audio              Retrieve the generated voice clone

When to Use Async API

  • Text longer than 1000 characters
  • High-quality audio generation
  • Batch processing multiple texts
  • Applications that can handle delayed results

Processing Times:

  • Short texts (1,000 to 5,000 characters): 2-5 minutes
  • Medium texts (5,000 to 15,000 characters): 5-10 minutes
  • Long texts (more than 15,000 characters): 10-20 minutes
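
As a rough aid for choosing a polling interval or timeout, the ranges above can be turned into a lookup. This is a sketch, not part of the API: `estimate_minutes` is a hypothetical helper, and because the listed ranges share their boundary values, it assumes that exactly 5,000 and exactly 15,000 characters fall into the longer band.

```python
from typing import Optional, Tuple

# (exclusive upper bound in characters, (min minutes, max minutes)),
# copied from the processing-time list above.
_BANDS = [
    (5000, (2, 5)),     # short texts
    (15000, (5, 10)),   # medium texts
]
_LONG_TEXT = (10, 20)   # 15000 characters and up


def estimate_minutes(text: str) -> Optional[Tuple[int, int]]:
    """Return the (min, max) processing time in minutes for a text.

    Returns None for texts under 1000 characters, because no
    asynchronous processing time is listed for them.
    """
    length = len(text)
    if length < 1000:
        return None
    for upper_bound, minutes in _BANDS:
        if length < upper_bound:
            return minutes
    return _LONG_TEXT
```

A caller might, for example, use the upper value of the returned range as the basis for an overall timeout.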

Getting Started

To begin using the asynchronous API:

  1. Prepare your content: Ensure your text is over 1000 characters
  2. Upload audio sample: Provide a 5-30 second voice sample
  3. Submit task: Use the Create Task endpoint
  4. Monitor progress: Poll the Task Status endpoint
  5. Download result: Retrieve your audio via Task Result
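
Steps 1 and 2 can be checked locally before a task is submitted. The sketch below assumes the voice sample is a WAV file, which this page does not state; `wav_duration_seconds` and `check_inputs` are hypothetical helper names, and only the 1000-character and 5-30 second limits come from the steps above.

```python
import io
import wave
from typing import List

MIN_TEXT_CHARS = 1000        # text must be over this length
MIN_SAMPLE_SECONDS = 5.0     # shortest accepted voice sample
MAX_SAMPLE_SECONDS = 30.0    # longest accepted voice sample


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of a WAV file held in memory (assumes WAV input)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_inputs(text: str, sample_seconds: float) -> List[str]:
    """Return a list of problems; an empty list means the inputs look fine."""
    problems = []
    if len(text) <= MIN_TEXT_CHARS:
        problems.append(
            f"text has {len(text)} characters; it must be over {MIN_TEXT_CHARS}"
        )
    if not (MIN_SAMPLE_SECONDS <= sample_seconds <= MAX_SAMPLE_SECONDS):
        problems.append(
            f"sample is {sample_seconds:.1f}s long; "
            f"expected {MIN_SAMPLE_SECONDS:.0f}-{MAX_SAMPLE_SECONDS:.0f}s"
        )
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue at once before spending time on an upload.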

For shorter texts, consider using the Synchronous API for immediate results.
