V-Blaze and V-Cloud Online Help

/transcribe Endpoint

The /transcribe POST method accepts a single audio file, or a zip file containing one or more audio files, for processing and returns a requestid string for tracking purposes. If you specify a callback, results are posted to the URL given in the callback parameter immediately after successful processing.

The /transcribe endpoint enables you to submit files for transcription.


POSTing to the /transcribe endpoint uses the following parameters. The token and file parameters are required: they provide authentication information and identify the audio or zip file that you are uploading. The remaining parameters are optional.

  • token (required) – string used to authenticate and authorize the request. You will receive a token from Voci Support to use with requests to the V‑Cloud API. All requests made with your token will be tied to your account. Please notify Voci Support immediately if your token is compromised or lost.

    Example of token format (non-functional token, do not use): 123e4567e89b12d3a456426655440000

  • file (required) – a single audio file or zip file that contains one or more audio files to process. Supported archive formats are zip (MIME type application/zip) and 7z (MIME type application/x-7z-compressed). The archive can be encrypted with a password, in which case the zpass parameter specifies the password.

    The Linux file command gives the following outputs for accepted audio file formats:

    $ file example1.wav
    example1.wav: RIFF (little-endian) data, WAVE audio, Microsoft PCM, 8 bit, mono 8000 Hz
    $ file example2.wav
    example2.wav: RIFF (little-endian) data, WAVE audio, ITU G.711 mu-law, mono 8000 Hz
    $ file example3.wav
    example3.wav: RIFF (little-endian) data, WAVE audio, ITU G.711 A-law, mono 8000 Hz

    The key compatibility indicators are RIFF, WAVE, Microsoft PCM, and ITU G.711. The sample rate may be between 8000 and 16000 Hz, where 8000 Hz is preferable. The file may also contain more than one channel.


    Transcoding from other audio formats is available as an optional licensed feature of V-Cloud. See Adjusting for Different Types of Input for more information.

  • callback (optional) – URL endpoint for V‑Cloud to POST results to when processing is complete. Can be either an HTTP or HTTPS endpoint.

  • model (optional) – string indicating which language model should be used to transcribe the audio. If not specified, the default model is used. Voci Support will provide the model names that are authorized for your account.

  • output (optional; one of json, jsontop, or text) – string indicating the desired output format. See output for detailed information about each of these values, and the Voci JSON Output Format Guide for detailed descriptions of all JSON fields.
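
The audio format criteria listed under the file parameter can be checked locally before uploading. The following is a minimal sketch, not part of the V‑Cloud API: the function name inspect_wav and the returned dictionary keys are illustrative assumptions. It reads the RIFF header and the 'fmt ' chunk, then compares the format tag (1 = Microsoft PCM, 6 = ITU G.711 A-law, 7 = ITU G.711 mu-law) and the sample rate against the values described above.

```python
import struct

# WAVE format tags for the three encodings named in this section.
ACCEPTED_FORMATS = {
    1: "Microsoft PCM",
    6: "ITU G.711 A-law",
    7: "ITU G.711 mu-law",
}


def inspect_wav(data: bytes) -> dict:
    """Parse the RIFF header and 'fmt ' chunk of WAV bytes (hypothetical helper)."""
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    # Walk the chunk list until the 'fmt ' chunk is found.
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (chunk_size,) = struct.unpack_from("<I", data, pos + 4)
        if chunk_id == b"fmt ":
            if chunk_size < 16 or pos + 8 + 16 > len(data):
                raise ValueError("truncated 'fmt ' chunk")
            # First 8 bytes of the chunk: format tag, channel count, sample rate.
            fmt_tag, channels, sample_rate = struct.unpack_from("<HHI", data, pos + 8)
            return {
                "format": ACCEPTED_FORMATS.get(fmt_tag, "unsupported (tag %d)" % fmt_tag),
                "channels": channels,
                "sample_rate": sample_rate,
                # Accepted encoding and a sample rate between 8000 and 16000 Hz.
                "compatible": fmt_tag in ACCEPTED_FORMATS and 8000 <= sample_rate <= 16000,
            }
        # Chunks are padded to an even number of bytes.
        pos += 8 + chunk_size + (chunk_size & 1)
    raise ValueError("no 'fmt ' chunk found")
```

To check a file on disk, read it in binary mode and pass the bytes, for example inspect_wav(open("example1.wav", "rb").read()). A file reported as not compatible may still be usable if the optional transcoding feature is licensed.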

This section discusses only the mandatory and primary V‑Cloud options. See V‑Cloud Transcription Parameters for information about other parameters that you can pass to produce specific content in specific formats in your transcripts.
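
As an illustration of how the primary parameters fit together, the following sketch assembles a POST request using only the Python standard library. Several details are assumptions rather than documented behavior: the host in TRANSCRIBE_URL is a placeholder, the helper name build_transcribe_request is hypothetical, and the parameters are assumed to be sent as multipart/form-data fields. The token shown in the usage note is the non-functional example from this section.

```python
import urllib.request
import uuid

# Placeholder host (assumption); substitute the address provided for your account.
TRANSCRIBE_URL = "https://vcloud.example.com/transcribe"
OUTPUT_FORMATS = ("json", "jsontop", "text")


def build_transcribe_request(token, filename, file_bytes, callback=None,
                             model=None, output=None, url=TRANSCRIBE_URL):
    """Build (but do not send) a POST request for /transcribe (hypothetical helper)."""
    if output is not None and output not in OUTPUT_FORMATS:
        raise ValueError("output must be one of: " + ", ".join(OUTPUT_FORMATS))

    # token is required; optional parameters are included only when given.
    fields = {"token": token}
    for name, value in (("callback", callback), ("model", model), ("output", output)):
        if value is not None:
            fields[name] = value

    # Assumption: parameters are encoded as multipart/form-data parts.
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append((
            "--%s\r\n"
            'Content-Disposition: form-data; name="%s"\r\n\r\n'
            "%s\r\n" % (boundary, name, value)
        ).encode("utf-8"))
    # The file part carries the raw audio or zip bytes.
    parts.append((
        "--%s\r\n"
        'Content-Disposition: form-data; name="file"; filename="%s"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n" % (boundary, filename)
    ).encode("utf-8") + file_bytes + b"\r\n")
    parts.append(("--%s--\r\n" % boundary).encode("utf-8"))

    return urllib.request.Request(
        url,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": "multipart/form-data; boundary=" + boundary},
    )
```

A request built with build_transcribe_request("123e4567e89b12d3a456426655440000", "example1.wav", audio_bytes, output="json") could then be sent with urllib.request.urlopen. The response is expected to carry the requestid string; its exact layout is not described in this section, so the sketch does not attempt to parse it.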