V-Spark Online Help

Audio Requirements

An audio file's format and other properties have a significant impact on the accuracy of ASR transcription, and the level of accuracy for a given transcript affects analytics performance. The best format for audio submitted for transcription and analysis is a lossless G.711 WAV (PCM, uLaw, or aLaw). V‑Spark supports a wide variety of audio types because it converts audio before passing it to the ASR engine for transcription. The best way to check audio file compatibility is with the Audio Evaluator.

Although V‑Spark converts audio for transcription, that conversion cannot account for voice data lost due to suboptimal recording and encoding practices. Refer to the Improving Transcription Accuracy section of the V‑Blaze User Guide for more information about how an audio file's channels and other properties impact ASR transcription.

The number of channels in an audio file also has a significant impact on transcription and analysis. Audio submitted for transcription and analysis must have one or two channels. The number of channels in source audio, along with how those channels are used, affects V‑Spark's ability to distinguish between speaker roles. In most cases, these roles are agent and client, and distinguishing between the two is critical for transcript analysis.

Transcription and analysis work best with two-channel (stereo) audio that has each speaker role on a separate channel. Audio with more than one speaker on the same channel may be diarized. Diarization separates the audio into two channels and assigns each speaker to a different channel.

Important

V‑Spark does not support audio with more than 2 channels.
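The channel limit can be checked locally before upload. The following is a minimal sketch, not part of V‑Spark, that reads the fmt chunk of a RIFF/WAVE file with the Python standard library and reports its format tag, channel count, and sample rate. The function names read_wav_format and channel_count_ok are hypothetical, and the sketch only handles WAV files; the Audio Evaluator remains the recommended compatibility check.

```python
import struct

# Format tags defined by the RIFF/WAVE format: 1 = PCM, 6 = aLaw, 7 = uLaw.
FORMAT_NAMES = {1: "PCM", 6: "aLaw", 7: "uLaw"}


def read_wav_format(path):
    """Return (format_tag, channels, sample_rate) from a WAV file's fmt chunk."""
    with open(path, "rb") as f:
        preamble = f.read(12)
        if len(preamble) < 12:
            raise ValueError("file too short to be a WAV file")
        riff, _size, wave_id = struct.unpack("<4sI4s", preamble)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("fmt chunk not found")
            chunk_id, chunk_size = struct.unpack("<4sI", header)
            if chunk_id == b"fmt ":
                fmt = f.read(8)
                if len(fmt) < 8:
                    raise ValueError("truncated fmt chunk")
                return struct.unpack("<HHI", fmt)
            # Skip other chunks; chunk data is padded to an even length.
            f.seek(chunk_size + (chunk_size & 1), 1)


def channel_count_ok(path):
    """True when the file has one or two channels (hypothetical local pre-check)."""
    _tag, channels, _rate = read_wav_format(path)
    return channels in (1, 2)
```

For example, channel_count_ok("call.wav") returns False for a three-channel recording, which would need to be downmixed or split before submission.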

Audio Filenames

The names of uploaded audio files and zip archives must adhere to the installation's filename requirements whether they are uploaded through the GUI or API. When uploading a zip file, only the name of the zip file is validated against the filename requirements; files inside the zip are not checked. Filename validation was first implemented in version 4.0.1-3 to help protect against remote code execution. By default, these characters are not permitted in uploaded filenames: #*<>:?/\|{}$!'`"=^
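A client-side pre-check can catch forbidden characters before an upload is attempted. The sketch below assumes the default character list shown above and does not reproduce V‑Spark's actual validation expression; the names FORBIDDEN_CHARS, forbidden_in, and is_upload_name_ok are hypothetical. Installations that define a custom pattern would need a different check.

```python
# Default forbidden characters, copied from the list above.
FORBIDDEN_CHARS = set("#*<>:?/\\|{}$!'`\"=^")


def forbidden_in(filename):
    """Return the sorted forbidden characters that appear in filename."""
    return sorted(set(filename) & FORBIDDEN_CHARS)


def is_upload_name_ok(filename):
    """True when filename contains none of the default forbidden characters."""
    return not forbidden_in(filename)
```

Pass only the name of the file or zip archive, not a full path, because path separators are themselves on the forbidden list.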

To disable filename validation, set the filename_validation system configuration setting to off. To define custom filename character requirements, specify a regular expression for the filename_validation_pattern system configuration setting.

Files are not required to have unique names at a system level, but as of V‑Spark 4.2.0-1, individual folders may be configured to reject files with duplicate filenames. In either case, unique filenames are a best practice, because duplicate filenames make some processing take longer. Consider adding the file's timestamp, call ID, or a UUID to create a unique filename.
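One way to build unique names is to append a timestamp, an optional call ID, and a short UUID fragment to the original name. The following sketch is illustrative only; the function name unique_filename and the naming layout are assumptions, not a V‑Spark convention. The timestamp format avoids colons so that the result passes the default filename character requirements.

```python
import uuid
from datetime import datetime, timezone
from pathlib import Path


def unique_filename(original, call_id=None, now=None):
    """Return original with a UTC timestamp, optional call ID, and UUID fragment."""
    path = Path(original)
    now = now or datetime.now(timezone.utc)
    # Compact timestamp with no colons or other forbidden characters.
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    parts = [path.stem, stamp]
    if call_id:
        parts.append(str(call_id))
    # Eight hex digits of a random UUID guard against same-second collisions.
    parts.append(uuid.uuid4().hex[:8])
    return "_".join(parts) + path.suffix
```

The random fragment matters because two files with identical names submitted in the same second could otherwise still collide.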

Note

If two files with identical names are submitted to the same folder at the exact same second, only one of those files will be processed.

When a folder has the deduplication setting enabled, that folder will reject file uploads in the following scenarios:

  • A file is uploaded with the same name as a previously uploaded file.

  • A zip file contains a file with the same name as a previously uploaded file.

  • A zip file contains two or more files with the same name.

Note that the entire zip is rejected when a duplicate file is detected, and duplicate file rejection for zip files nested inside other zip files is not supported. When folder-level deduplication causes a file to be rejected, V‑Spark generates a WARNING-level message in server.log and the Activity Log.
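Because a single duplicate causes the entire zip to be rejected, it can be useful to inspect an archive before uploading it to a folder with deduplication enabled. The sketch below uses Python's zipfile module to report names repeated inside the archive and names that match a caller-supplied set of previously uploaded names. It assumes that names are compared by base name, ignoring any directory path inside the zip; how V‑Spark treats paths is not stated here. It does not look inside nested zip files, and find_zip_duplicates is a hypothetical helper.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def find_zip_duplicates(zip_path, previously_uploaded):
    """Return (names repeated inside the zip, names already uploaded)."""
    with zipfile.ZipFile(zip_path) as zf:
        # Assumption: compare base names only, ignoring directories in the zip.
        names = [
            PurePosixPath(info.filename).name
            for info in zf.infolist()
            if not info.is_dir()
        ]
    counts = Counter(names)
    repeated = sorted(name for name, count in counts.items() if count > 1)
    already_uploaded = sorted(set(names) & set(previously_uploaded))
    return repeated, already_uploaded
```

If either returned list is non-empty, renaming or removing the listed files before upload avoids losing the whole archive to a rejection.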

When duplicate file rejection is enabled for a folder, audio may still be reloaded for reprocessing, but reloaded MP3 audio data will be rejected when the original audio file was also an MP3. A previously processed file that was originally uploaded in any other format is accepted. Reloading JSON data is not affected by the folder-level duplicate rejection setting.