V-Blaze and V-Cloud Online Help

Adjusting for Different Types of Input

Table 1. Adjusting for different types of input





diarize: false (default), true, noise

Diarization is the process of recognizing distinct speakers on a single (mono) audio channel and separating the detected speech into per-speaker channels, which are identified in the JSON output. Voci’s diarization capability is designed for two speakers, typically a call agent in a phone conversation with a client.

Set diarize to true only when all of the following conditions apply:

  • You know that your audio contains only a single audio channel.

  • You know that two people are talking on the channel.

  • Separating the two speakers in the transcripts is important for your use case.
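
The conditions above can be expressed as a simple check made before a request is submitted. The following Python sketch is illustrative only: the function name should_diarize and its arguments are hypothetical and are not part of the V-Blaze or V-Cloud API.

```python
def should_diarize(channel_count: int, speaker_count: int,
                   needs_speaker_separation: bool) -> bool:
    """Return True only when all three documented conditions hold."""
    return (
        channel_count == 1            # audio has a single (mono) channel
        and speaker_count == 2        # exactly two people are talking
        and needs_speaker_separation  # per-speaker transcripts matter
    )


# Mono call recording with an agent and a client, speaker separation needed.
print(should_diarize(1, 2, True))   # True
# Stereo audio does not meet the single-channel condition.
print(should_diarize(2, 2, True))   # False
```

If the check returns False, leave diarize at its default value of false.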

Enabling diarize will include the following fields in JSON output:

  • diascore — Indicates the system's level of confidence that it correctly classified detected speech into individual channels. The confidence level is expressed as a value between 0 and 1, where 1 indicates the best speaker separation. Refer to Confidence Scores for more information on the confidence scoring system.

  • chaninfo — Provides additional information specific to each channel. chaninfo only appears for stereo or diarized audio. Refer to Top-level Elements for more information.
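
As a rough illustration of how a client might read these two fields, the Python sketch below parses a response and flags a low diascore. Only the field names diascore and chaninfo come from the description above. The sample JSON, the assumption that diascore sits at the top level of the output, the contents of each chaninfo entry, and the 0.5 threshold are all hypothetical.

```python
import json

# Hypothetical response fragment, not real V-Blaze or V-Cloud output.
SAMPLE = '{"diascore": 0.82, "chaninfo": [{"channel": 0}, {"channel": 1}]}'


def summarize_diarization(raw_json: str, min_score: float = 0.5) -> dict:
    """Summarize the diarization fields found in a JSON transcript.

    min_score is an arbitrary illustrative threshold, not a documented value.
    """
    doc = json.loads(raw_json)
    score = doc.get("diascore")  # absent when diarize was not enabled
    return {
        "diarized": score is not None,
        "diascore": score,
        "low_confidence": score is not None and score < min_score,
        "channels": len(doc.get("chaninfo", [])),
    }


print(summarize_diarization(SAMPLE))
```

A transcript flagged as low confidence might be routed for manual review instead of being trusted for per-speaker analysis.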

The noise setting is typically not needed. However, if you are experiencing excessive diarization errors due to interference from non-speech sources, you can apply noise reduction by setting diarize=noise.


For noise filtering, music identification is recommended over diarize=noise.
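
Because diarize accepts only three values, a client can validate the setting before sending a request. The Python sketch below is a minimal example under that assumption. The helper name diarize_field is hypothetical, and the sketch only builds the name/value pair; how it is sent along with the audio depends on your client and is not shown here.

```python
ALLOWED_DIARIZE = ("false", "true", "noise")


def diarize_field(value="false") -> dict:
    """Return the diarize name/value pair, rejecting unsupported values."""
    value = str(value).lower()
    if value not in ALLOWED_DIARIZE:
        raise ValueError(
            f"diarize must be one of {ALLOWED_DIARIZE}, got {value!r}"
        )
    return {"diarize": value}


print(diarize_field("noise"))  # {'diarize': 'noise'}
```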


Redaction accuracy is marginally reduced when redaction is combined with diarize. For maximum redaction accuracy, avoid diarization when using any of the redaction options.

Diarization is a licensed optional feature.


transcode: false (default), true

Determines whether V‑Cloud should use its built-in decoders to try to convert incoming audio into a supported format, if necessary. This option cannot be used with the truncate option.

Transcode functionality supports an extensive set of open audio formats. To find out whether a particular format is supported, submit audio in that format in a request.
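
The restriction that this option cannot be used with truncate can be enforced on the client side before a request is sent. The Python sketch below assumes the option is named transcode (inferred from the description above) and treats the mere presence of a truncate entry as that option being in use. The helper name check_options and the sample values are hypothetical.

```python
def check_options(options: dict) -> dict:
    """Raise ValueError if transcode is enabled together with truncate."""
    transcode_on = str(options.get("transcode", "false")).lower() == "true"
    if transcode_on and "truncate" in options:
        raise ValueError("transcode cannot be combined with truncate")
    return options


# Accepted: transcode enabled without truncate.
print(check_options({"transcode": "true"}))  # {'transcode': 'true'}
```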