Speech to Text

Asynchronously transcribe audio to text. The result contains the full text, a per-segment timeline, and the detected language. Submitting a request returns a task_id; poll with that task_id to retrieve the result.

Base URL: https://api.aiclonevoicefree.com | Auth: Authorization: Bearer sk_...

`POST /api/v2/voice/transcribe`

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| `audio_url` | string | ✅ | Audio URL (alias `url` also accepted) |
| `duration_seconds` | number | ✅ | Audio length in seconds, used for per-second billing |
| `language` | string | ⬜ | Language code; omit to auto-detect |

Billing

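Transcription costs 1 credit per second of audio: cost = ceil(duration_seconds). Credits are deducted up front at submit time after a balance check (the request returns 402 if the balance is insufficient) and are refunded automatically if the task fails.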
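To estimate the charge before submitting, the billing formula can be reproduced locally. This is a minimal sketch; the function name `transcription_cost` and the rejection of non-positive durations are assumptions for illustration, not part of the API.

```python
import math


def transcription_cost(duration_seconds: float) -> int:
    """Credits charged for a clip: 1 credit per second, rounded up."""
    # Assumption: non-positive durations are treated as invalid input here;
    # the API's own behavior for such values is not documented above.
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return math.ceil(duration_seconds)


print(transcription_cost(42))    # 42
print(transcription_cost(42.3))  # 43
```

Comparing this value against the account balance before submitting can help avoid a 402 response.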

curl -X POST https://api.aiclonevoicefree.com/api/v2/voice/transcribe \
  -H "Authorization: Bearer sk_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{ "audio_url": "https://your-cdn.com/speech.mp3", "duration_seconds": 42 }'

Response 202 → { "task_id": "...", "status": "pending", "capability": "voice", "action": "transcribe" }
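The same submission can be sketched in Python using only the standard library. The helper names `build_transcribe_request` and `submit_transcription` are hypothetical; the endpoint, headers, and body fields are taken from the curl example above.

```python
import json
import urllib.request

BASE_URL = "https://api.aiclonevoicefree.com"


def build_transcribe_request(api_key, audio_url, duration_seconds, language=None):
    """Build the POST request without sending it (hypothetical helper)."""
    payload = {"audio_url": audio_url, "duration_seconds": duration_seconds}
    if language is not None:
        # Omitting "language" lets the service auto-detect it.
        payload["language"] = language
    return urllib.request.Request(
        BASE_URL + "/api/v2/voice/transcribe",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit_transcription(api_key, audio_url, duration_seconds, language=None):
    """Send the request and return the task_id from the 202 response."""
    req = build_transcribe_request(api_key, audio_url, duration_seconds, language)
    # urlopen raises urllib.error.HTTPError for non-2xx responses,
    # for example 402 when the credit balance is insufficient.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["task_id"]
```

A call such as `submit_transcription("sk_your_api_key", "https://your-cdn.com/speech.mp3", 42)` would return the task_id to poll with.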

Getting the result

Poll GET /api/v2/voice/transcribe/{task_id} (note: transcription has its own status endpoint, not the unified /tasks):

{
  "task_id": "...",
  "status": "completed",
  "capability": "voice",
  "action": "transcribe",
  "_type": "voice.transcribe",
  "text": "full transcript...",
  "language": "en",
  "segments": [{ "start": 0.0, "end": 3.2, "text": "..." }]
}
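A polling loop for this endpoint might look like the sketch below. Only the "pending" and "completed" statuses appear in this page, so the failure status name "failed", the 2-second interval, and the 5-minute timeout are assumptions; the function names are hypothetical.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.aiclonevoicefree.com"


def fetch_status(api_key, task_id):
    """GET the transcription task and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/v2/voice/transcribe/{task_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_transcript(fetch, task_id, interval=2.0, timeout=300.0,
                        failed_statuses=("failed",)):
    """Call fetch(task_id) until the task completes, fails, or times out.

    `fetch` is any callable returning the status dict, which keeps the
    loop independent of the HTTP client. `failed_statuses` is an assumed
    set of terminal failure names, not confirmed by the documentation.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = fetch(task_id)
        status = result["status"]
        if status == "completed":
            return result
        if status in failed_statuses:
            raise RuntimeError(f"transcription {task_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"transcription {task_id} still {status!r} after {timeout}s")
        time.sleep(interval)
```

With a real key this could be called as `wait_for_transcript(lambda t: fetch_status("sk_your_api_key", t), task_id)`; the returned dict then carries `text`, `language`, and `segments` as shown above.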
