V-Blaze and V-Cloud Online Help

params

URL: http://vblaze_name:17171/params

Example Response:

      {
        "params": {
          "activitylevel": 175,
          "bufmaxtime": 30,
          "endian": "LITTLE",
          "idletimeout": 30,
          "languages": [
            "eng1"
          ],
          "model": "eng1:callcenter",
          "models": [
            "eng1:callcenter",
            "eng1:survey",
            "eng2:voicemail"
          ],
          "numtrans": true,
          "outputdir": "/opt/voci/ramfs",
          "punctrailing": 12,
          "punctuate": true,
          "pushconntimeout": 5,
          "queue": "bottom",
          "raw_events": false,
          "realtime": false,
          "recvtimeout": null,
          "scrubmindist": 0.3,
          "uttmaxsilence": 800,
          "uttmaxtime": 80,
          "uttminactivity": 500,
          "uttpadding": 300,
          "vadparams": {}
        }
      }
    

Explanation:

The example response is a JSON object that shows the names of audio-independent parameters that can be specified when initiating a transcription session and their default values.
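As a minimal sketch, the defaults could be retrieved and inspected with the Python standard library. The helper names fetch_params and parse_params are hypothetical and not part of V-Blaze; the port is taken from the URL above, and the response body is assumed to be UTF-8 encoded JSON.

```python
import json
from urllib.request import urlopen


def parse_params(body):
    """Extract the parameter defaults from a /params response body."""
    return json.loads(body)["params"]


def fetch_params(host, port=17171, timeout=10):
    """Request /params from a host and return the inner "params" object.

    Hypothetical helper: assumes the host is reachable over plain HTTP.
    """
    with urlopen(f"http://{host}:{port}/params", timeout=timeout) as resp:
        return parse_params(resp.read().decode("utf-8"))


# Offline demonstration using a trimmed copy of the example response.
sample = '{"params": {"model": "eng1:callcenter", "punctuate": true, "recvtimeout": null}}'
defaults = parse_params(sample)
for name in sorted(defaults):
    print(f"{name} = {defaults[name]!r}")
```

Against a running system, fetch_params("vblaze_name") would return the same kind of dictionary, with vblaze_name replaced by the actual host name.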

activitylevel

Indicates the voice activity level below which the ASR engine considers any audio event to be silence.

bufmaxtime

Indicates the current buffer size configuration, which is the maximum amount of audio (seconds) to analyze for diarization.

endian

Indicates the byte ordering of audio samples. In a BIG endian data word, the most significant byte comes first when reading from left to right. In a LITTLE endian data word, the least significant byte comes first. LITTLE endian is the default and the most common byte order.
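The difference between the two byte orders can be illustrated with Python's struct module. The 16-bit sample width and the sample value are assumptions chosen for the illustration; they are not taken from the V-Blaze documentation.

```python
import struct

sample = 0x1234  # one signed 16-bit sample value, chosen for illustration

little = struct.pack("<h", sample)  # LITTLE endian: least significant byte first
big = struct.pack(">h", sample)     # BIG endian: most significant byte first

print(little.hex())  # 3412
print(big.hex())     # 1234

# Reading LITTLE endian bytes as BIG endian yields a different sample value,
# which is why the endian setting must match the audio data.
misread = struct.unpack(">h", little)[0]
```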

idletimeout

Indicates the amount of time an API request will wait for a response before timing out.

languages

Indicates the languages that are available. The languages indicated are components of the model used by the ASR engine for transcription. These components convert the stream of sound symbols from an acoustic model into text.

models

The model key indicates the default language model, and the models key lists the models available for transcription requests.
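A client might check a requested model against this list before starting a session. The following is a sketch only; choose_model is a hypothetical helper, and the params dictionary is a trimmed copy of the example response.

```python
def choose_model(params, requested=None):
    """Return the requested model if it is listed, or the default if none is given.

    Raises ValueError when the requested model is not in the "models" list.
    """
    if requested is None:
        return params["model"]
    if requested not in params["models"]:
        raise ValueError(f"model {requested!r} is not one of {params['models']}")
    return requested


params = {
    "model": "eng1:callcenter",
    "models": ["eng1:callcenter", "eng1:survey", "eng2:voicemail"],
}
print(choose_model(params))
print(choose_model(params, "eng2:voicemail"))
```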

The remaining JSON keys and values are audio-independent parameters and their default values. Most of these parameters never require modification, but are provided to enable tuning for special circumstances, such as aggressive real-time applications.

numtrans

Indicates the default value of numtrans. The numtrans parameter controls whether or not number words in transcribed text are converted into numeric digits and related conventional formats, including dollar amounts, wall-clock times, percentages, ordinals, and telephone numbers.

outputdir

Indicates the directory where intermediate processing and result data is stored.

punctrailing

Indicates the number of words that must follow a sentence before the punctuation engine finalizes it. If punctrailing is 12, then the punctuation engine waits until there are at least 12 words after a sentence before finalizing that sentence.
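The waiting rule can be pictured as a simple word count. This is an illustrative sketch, not the punctuation engine's actual logic; can_finalize and its arguments are hypothetical names.

```python
def can_finalize(total_words, sentence_end, punctrailing=12):
    """Return True once at least `punctrailing` words follow the sentence.

    total_words  -- number of words received so far
    sentence_end -- number of words up to and including the sentence's last word
    """
    trailing = total_words - sentence_end
    return trailing >= punctrailing


# A sentence ends at word 8. With 19 words received, only 11 words trail it.
print(can_finalize(19, 8))  # False
# One more word brings the trailing count to 12.
print(can_finalize(20, 8))  # True
```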

punctuate

Indicates the default value of punctuate. The punctuate parameter controls whether transcript text is punctuated or not. In most cases, it is desirable to leave punctuation turned on, but there are special cases where it should be turned off.

pushconntimeout

Indicates the number of seconds to wait for a push data source to initiate a connection to the decode server. pushconntimeout cannot exceed 120 seconds.

queue

Indicates the default value of queue. The queue parameter determines the order of requests in the transcription queue. Setting the value to bottom inserts the stream at the end of the queue, which means streams are processed in the order they are received. Setting the value to top inserts the stream at the beginning of the queue, which is useful for skipping the queue when submitting high-priority jobs.
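The effect of the two values can be modeled with a double-ended queue. This sketch only illustrates the ordering described above; enqueue and the stream names are hypothetical and say nothing about how V-Blaze implements its queue.

```python
from collections import deque


def enqueue(pending, stream, position="bottom"):
    """Insert a stream at the end ("bottom") or the front ("top") of the queue."""
    if position == "bottom":
        pending.append(stream)
    elif position == "top":
        pending.appendleft(stream)
    else:
        raise ValueError(f"unknown queue position: {position!r}")


pending = deque()
enqueue(pending, "call-1")
enqueue(pending, "call-2")
enqueue(pending, "urgent", position="top")  # skips ahead of waiting streams

# Streams are taken from the front of the queue for processing.
order = [pending.popleft() for _ in range(len(pending))]
print(order)  # ['urgent', 'call-1', 'call-2']
```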

raw_events

Indicates the default value of raw_events. When set to true, the raw_events parameter adds a raw_events list to the JSON output under utterances. This list includes silence, filler words, wordex, and unpunctuated text.

realtime

Indicates the default value of realtime. The realtime parameter controls whether the ASR engine processes incoming audio in real-time mode.

recvtimeout

Indicates the amount of time (milliseconds) before timing out when receiving audio data. A value of null or 0 means no timeout. The default value is null.

scrubmindist

Indicates the default value of scrubmindist. The scrubmindist parameter specifies the number of seconds within which two scrubbed audio sections will be merged when the scrubaudio parameter is set to true.
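The merging behavior can be sketched as interval merging over (start, end) times in seconds. The function merge_scrubbed is a hypothetical helper, and treating a gap exactly equal to scrubmindist as mergeable is an assumption; the actual boundary behavior is not stated here.

```python
def merge_scrubbed(sections, scrubmindist=0.3):
    """Merge scrubbed sections whose gap is at most `scrubmindist` seconds.

    sections -- iterable of (start, end) tuples in seconds
    """
    merged = []
    for start, end in sorted(sections):
        if merged and start - merged[-1][1] <= scrubmindist:
            # Close enough to the previous section: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


sections = [(1.0, 2.0), (2.2, 3.0), (5.0, 6.0)]
print(merge_scrubbed(sections))  # [(1.0, 3.0), (5.0, 6.0)]
```

The first two sections are about 0.2 seconds apart, so they become one scrubbed region; the third is 2 seconds away and stays separate.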

uttmaxsilence

Indicates the default value of uttmaxsilence. The uttmaxsilence parameter specifies the maximum amount of silence in milliseconds that can occur between speech sounds without terminating the current utterance. Once a silence occurs that exceeds uttmaxsilence milliseconds, an utterance “cut” is made within the detected silent region.

uttmaxtime

Indicates the default value for uttmaxtime. The uttmaxtime parameter specifies the maximum amount of time in seconds that is allotted for a spoken utterance. Normally an utterance is terminated by a sufficient duration of silence, but if no such period of silence is encountered prior to reaching uttmaxtime, the utterance is terminated forcibly.
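The interaction between uttmaxsilence and uttmaxtime can be sketched over a list of per-frame speech flags. This is a simplified model, not the ASR engine's algorithm: split_utterances, the 100 ms frame size, and the choice to place the cut at the start of the silent region are all assumptions made for the illustration.

```python
def split_utterances(active, frame_ms=100, uttmaxsilence=800, uttmaxtime=80):
    """Split per-frame speech flags into (start_ms, end_ms) utterances.

    active        -- list of booleans, True where a frame contains speech
    uttmaxsilence -- silence (ms) that must be exceeded to end an utterance
    uttmaxtime    -- maximum utterance length (seconds) before a forced cut
    """
    max_ms = uttmaxtime * 1000
    utterances = []
    start = None     # start of the open utterance, or None if none is open
    last_end = None  # end time of the most recent speech frame
    for i, is_active in enumerate(active):
        t = i * frame_ms
        if is_active:
            if start is None:
                start = t
            last_end = t + frame_ms
            if last_end - start >= max_ms:
                # No sufficient silence was found in time: cut forcibly.
                utterances.append((start, last_end))
                start = None
        elif start is not None and (t + frame_ms) - last_end > uttmaxsilence:
            # Silence now exceeds uttmaxsilence: cut at the start of the silence.
            utterances.append((start, last_end))
            start = None
    if start is not None:
        utterances.append((start, last_end))
    return utterances


# 0.4 s of silence does not end the first utterance; 1.0 s of silence does.
flags = [True] * 5 + [False] * 4 + [True] * 3 + [False] * 10 + [True] * 2
print(split_utterances(flags))  # [(0, 1200), (2200, 2400)]
```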

uttminactivity

Indicates the default value of uttminactivity. The uttminactivity parameter specifies how much activity is needed (not counting uttpadding) for audio to be classified as an utterance.

uttpadding

Indicates the default value of uttpadding. The uttpadding parameter specifies how much padding around the active area to treat as active. Typically the higher the activitylevel, the more padding is needed. Lower activity levels require less padding.
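One way to picture how uttminactivity and uttpadding work together is to filter detected active regions by length and then widen the ones that remain. This is a sketch under stated assumptions: pad_regions is a hypothetical helper, and both parameters are assumed to be expressed in milliseconds, which is not stated above.

```python
def pad_regions(regions, uttminactivity=500, uttpadding=300):
    """Drop active regions that are too short, then pad the rest on both sides.

    regions -- list of (start_ms, end_ms) tuples of detected activity
    """
    kept = []
    for start, end in regions:
        if end - start < uttminactivity:
            continue  # not enough activity to count as an utterance
        # Padding is applied after the length check and never goes below 0 ms.
        kept.append((max(0, start - uttpadding), end + uttpadding))
    return kept


regions = [(100, 300), (1000, 1800)]
print(pad_regions(regions))  # [(700, 2100)]
```

The 200 ms region is discarded because the length check uses the unpadded duration, while the 800 ms region is kept and extended by 300 ms on each side.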

vadparams

Indicates the parameters provided to the voice activity detection (VAD) engine of V‑Blaze.

Both audio-independent and audio-dependent parameters are discussed in V‑Blaze Transcription Parameters.