/transcribe API Reference
V‑Spark uses the HTTP POST method to submit audio and optional metadata files for processing. Please refer to the V‑Spark Management Guide for more comprehensive information about supported audio formats, filename requirements, and metadata formatting details.
Uploading individual or multiple files
When using the /transcribe API to submit files for transcription, single audio files and JSON transcripts can be submitted individually. Files submitted individually will not be associated with each other.
Multiple files can be submitted in a single POST request, but they must be encapsulated into a single zip file. These zip files can contain both audio data and metadata. Audio files and metadata files submitted as parts of a zip file will remain associated with each other as parts of a single submission.
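As an illustration of the packaging step, the following minimal sketch bundles an audio file and a metadata file into one in-memory zip archive using Python's standard zipfile module. The helper name build_submission_zip, the member names, and the placeholder payloads are all hypothetical; the required metadata format and naming rules are those described in the V‑Spark Management Guide, not the ones assumed here.

```python
import io
import zipfile


def build_submission_zip(files):
    """Pack named byte payloads into one in-memory zip archive.

    `files` maps archive member names to raw bytes, for example an
    audio file and its metadata file (hypothetical names).
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


# Placeholder payloads only; real submissions carry real audio and metadata.
zip_bytes = build_submission_zip({
    "call_0001.wav": b"placeholder audio bytes",
    "call_0001.json": b'{"agent": "example"}',
})
```

Building the archive in memory keeps the sketch self-contained; writing to a file on disk with zipfile.ZipFile(path, "w") works the same way.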
Note
The maximum size of a file that can be submitted for transcription using the /transcribe API is 250 MB.
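A client can check the size of a file before uploading it. The sketch below is one way to do that; it assumes that "250 MB" means 250,000,000 bytes, which is the more conservative reading, since this reference does not say whether the limit is decimal or binary. The names MAX_UPLOAD_BYTES and within_upload_limit are hypothetical.

```python
# Assumption: 250 MB is read as 250 * 1000 * 1000 bytes (the stricter reading).
MAX_UPLOAD_BYTES = 250 * 1000 * 1000


def within_upload_limit(size_in_bytes, limit=MAX_UPLOAD_BYTES):
    """Return True when a file of the given size fits under the limit."""
    return 0 <= size_in_bytes <= limit
```

For a file on disk, the size can be obtained with os.path.getsize(path) and passed to within_upload_limit.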
Note
Any metadata that you provide must be formatted as described in the Metadata Management section of the V‑Spark Management Guide.
Tip
V‑Spark's GUI enables you to submit individual files in various formats to a specific folder. Use the menu's command to display your folders, then click the button to the right of a folder's name.
See the Audio Management section of the V‑Spark Management Guide for more information.
File name rules
The names of uploaded audio files and zip archives must adhere to the installation's filename requirements whether they are uploaded through the GUI or API. By default, these characters are not permitted in uploaded filenames: #*<>:?/\|{}$!'`"=^
Filename validation can be disabled by setting the filename_validation configuration setting to off. Custom filename character requirements can be defined by specifying a regular expression via filename_validation_pattern.
A request submitted to the /transcribe endpoint with an invalid filename parameter will fail with HTTP error code 422.
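To avoid a 422 response, a client can screen filenames locally before submitting them. The following minimal sketch checks a filename against the default forbidden characters listed above. It does not cover installations that disable validation or define a custom filename_validation_pattern, and the names DEFAULT_FORBIDDEN and invalid_characters are hypothetical.

```python
# Characters rejected by default, copied from the list in this section.
DEFAULT_FORBIDDEN = set("#*<>:?/\\|{}$!'`\"=^")


def invalid_characters(filename):
    """Return the sorted forbidden characters that appear in `filename`."""
    return sorted(set(filename) & DEFAULT_FORBIDDEN)


problems = invalid_characters("call#1?.wav")
if problems:
    print("Rename the file before uploading; found:", "".join(problems))
```

The function expects a bare filename rather than a full path, because path separators are themselves in the forbidden set.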
Transcription options
All parameters that control transcription options are specified in the V‑Spark Folder definition. These include the language models used to decode each audio channel, the number of speakers, and the number of audio channels (that is, mono or stereo). It is therefore unnecessary to provide these parameters when POSTing files to V‑Spark.
Authorization token
You can use either the root token for your V‑Spark installation or the token for the company that is associated with the organization and folder to which you are submitting your transcription request. See V‑Spark API Permission Requirements for information about locating these tokens and the rights that these tokens give you.
Example POST request using a zip file
When using the /transcribe API to submit zip files for transcription of the audio files that they contain, the POST must be encoded as a multipart/form-data request, with the zip file name provided in a file field and a V‑Spark authorization token provided in the token field.
The following is an example of calling the /transcribe API method using the cURL command-line utility:
curl -F token=0123456789abcde0123456789abcde01 \
     -F "file=@/path/to/audio_and_meta.zip;type=application/zip" \
     -X POST https://hostname/transcribe/org_shortname/folder_name
The cURL utility is freely available for operating systems including Linux, Windows, and macOS.
Note
Items shown as replaceable in the sample cURL command are example settings only and must be replaced with real values that are appropriate for your environment.
In the example command, note that org_shortname refers to the Short Name assigned to the target Organization, which can be found on the V‑Spark Settings page in the Organization section of V‑Spark. The folder_name value refers to the folder within that organization into which you want to upload the audio contained in the zip file.
The cURL command exits after transmission of the zip file to the V‑Spark instance has completed.
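The same multipart/form-data request can be assembled from a script. The sketch below builds, but does not send, an equivalent POST using only Python's standard library. The helper name build_transcribe_request is hypothetical, and the host, organization, folder, and token values are the same placeholders used in the cURL example.

```python
import urllib.request
import uuid


def build_transcribe_request(base_url, org_shortname, folder_name,
                             token, zip_name, zip_bytes):
    """Build (without sending) a multipart/form-data POST for /transcribe."""
    boundary = uuid.uuid4().hex
    crlf = b"\r\n"
    body = b"".join([
        # First part: the authorization token in the "token" field.
        b"--" + boundary.encode() + crlf,
        b'Content-Disposition: form-data; name="token"' + crlf + crlf,
        token.encode() + crlf,
        # Second part: the zip archive in the "file" field.
        b"--" + boundary.encode() + crlf,
        b'Content-Disposition: form-data; name="file"; filename="'
        + zip_name.encode() + b'"' + crlf,
        b"Content-Type: application/zip" + crlf + crlf,
        zip_bytes + crlf,
        # Closing boundary.
        b"--" + boundary.encode() + b"--" + crlf,
    ])
    url = f"{base_url}/transcribe/{org_shortname}/{folder_name}"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


request = build_transcribe_request(
    "https://hostname", "org_shortname", "folder_name",
    "0123456789abcde0123456789abcde01",
    "audio_and_meta.zip", b"placeholder zip bytes",
)
# Sending is commented out because "hostname" is a placeholder:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

Separating request construction from sending makes it possible to inspect the URL, headers, and body before anything is transmitted.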
Next steps
The POST returns a universally unique identifier (UUID) that identifies the transcription request. All transcripts produced as a result of the request will include a requestid field with its value set to this UUID. The requestid enables you to correlate individual transcripts with specific transcription requests.
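As an illustration of that correlation, the following minimal sketch groups transcript records by their requestid value. It assumes the transcripts have already been parsed into Python dictionaries; the helper name group_by_request, the filename field, and the UUID values are hypothetical sample data, and only the requestid field comes from this reference.

```python
from collections import defaultdict


def group_by_request(transcripts):
    """Group transcript dictionaries by the value of their requestid field."""
    grouped = defaultdict(list)
    for transcript in transcripts:
        grouped[transcript["requestid"]].append(transcript)
    return dict(grouped)


# Hypothetical sample records; real transcripts contain many more fields.
transcripts = [
    {"requestid": "11111111-1111-1111-1111-111111111111", "filename": "a.wav"},
    {"requestid": "22222222-2222-2222-2222-222222222222", "filename": "b.wav"},
    {"requestid": "11111111-1111-1111-1111-111111111111", "filename": "c.wav"},
]
by_request = group_by_request(transcripts)
```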
Once the audio has been transcribed, the transcripts (along with optional metadata) are loaded into V‑Spark. Transcripts from any given request can be retrieved using the aforementioned UUID with the /request endpoint, or by using a callback server. Please refer to the V‑Spark Review and Analysis Guide for details regarding browsing, searching, and analyzing the calls and metadata within V‑Spark.