V-Blaze and V-Cloud Online Help

Language Model Configuration

The models.list file is located in /opt/voci/. It contains options to configure language model sizing and availability parameters. A common use case for configuring models.list is to specify which models to load at startup and which models to load when requested (dynamic loading). Dynamic loading is available for language models only; all acoustic models load at startup.

General Behavior

V‑Blaze automatically attempts to optimize GPU memory usage at system startup by loading all necessary models. If the system runs out of memory, it restarts in safe mode, ends all acoustic model processes, and sets numgpus to 1. In safe mode, one instance of each model loads so that the system can measure memory requirements and adjust loading behavior accordingly. When the system finishes calculating the optimal memory sizing for the models, it restarts again. The following entries are logged to /var/log/vociserver/err.log during the memory sizing process:

Decode server failed to start. Restarting in safe mode.

Decode server successfully started in safe mode. Restarting.
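To check whether a startup went through this memory sizing pass, the error log can be scanned for the two entries above. The following is a minimal Python sketch, not a Voci tool: the function name find_safe_mode_events is hypothetical, and it assumes the messages appear verbatim somewhere within a log line (the surrounding line format, such as timestamps, is not specified here).

```python
from typing import Iterable, List, Tuple

# Messages quoted from the documentation above. They are matched as substrings
# because the rest of the log-line format (timestamps, prefixes) is an assumption.
SAFE_MODE_ENTERED = "Decode server failed to start. Restarting in safe mode."
SAFE_MODE_DONE = "Decode server successfully started in safe mode. Restarting."


def find_safe_mode_events(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, event) pairs for safe-mode entries in a log."""
    events = []
    for number, line in enumerate(lines, start=1):
        if SAFE_MODE_ENTERED in line:
            events.append((number, "entered"))
        elif SAFE_MODE_DONE in line:
            events.append((number, "completed"))
    return events
```

The function accepts any iterable of lines, so an open file handle for /var/log/vociserver/err.log can be passed to it directly.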

Note

V‑Blaze attempts to adhere to the settings specified in models.list. However, the settings may be reduced until all models are able to load.

models.list

If no models.list configuration is provided, the system will attempt to load all available language models and their corresponding acoustic models. If licensing restricts certain installed models, a subset of licensed models will load instead. The following models.list configuration limits the models loaded at startup to eng1:callcenter and spa1:callcenter (if available and licensed):

eng1:callcenter
spa1:callcenter

Note

  • Models can expire even if they are loaded at startup by default.

  • If an installed model is excluded then the associated acoustic model won't load.

Language #config parameter syntax (optional):

  • memfree=N — Defines the minimum amount of system memory, in GB, to keep available. The ASR engine will attempt to unload the least used language models until the memfree specification is met.

  • expire=H — Hours of idle time after which a model should be automatically unloaded (default is to never unload because of idle time).

  • all=no|demand|load — The value defined in all applies to all available language models. The following list explains each value:

    • load — Load the specified language model at startup (default).

    • demand — Load the specified language model if requested by a stream.

    • no — Never load the specified language model. If no model lines are provided, all models available on the system will be loaded (or made available on-demand with all=demand).
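The memfree behavior described above can be pictured as a simple eviction loop. The following Python sketch is illustrative only and does not represent the ASR engine's actual implementation: the function name models_to_unload, the per-model size and use-count bookkeeping, and the reading of "least used" as "lowest use count" are all assumptions.

```python
from typing import List, Tuple


def models_to_unload(
    free_gb: float,
    memfree_gb: float,
    loaded: List[Tuple[str, float, int]],
) -> List[str]:
    """Pick models to unload until at least memfree_gb of memory is free.

    loaded holds (model_name, size_gb, use_count) tuples; these fields are
    hypothetical bookkeeping, not values exposed by the product.
    """
    to_unload = []
    # Assumption: "least used" means the lowest use count goes first.
    for name, size_gb, _uses in sorted(loaded, key=lambda m: m[2]):
        if free_gb >= memfree_gb:
            break
        to_unload.append(name)
        free_gb += size_gb  # unloading a model frees its memory
    return to_unload
```

With memfree=5 and 2 GB currently free, the loop keeps selecting the least used models until the running total of free memory reaches 5 GB or no loaded models remain.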

General model parameter syntax: model [load|demand|no]

  • load — Load the specified language model at startup (default).

  • demand — Load the specified language model if requested by a stream.

  • no — Never load the specified language model. If no model lines are provided, all models available on the system will be loaded (or made available on-demand with all=demand).

  • cycle=N — The value defined in cycle (number of hours) specifies when to reload language models to optimize memory use.

  • numgpus=N — Define number of GPU processes for associated acoustic models. Acoustic processes are shared among all language models that use the same acoustic model. numgpus is calculated using the following formula:

    • roundup(min(maxdecoders, ncpus) / 4)

      • roundup() — Round up if the final value is fractional

      • min() — Select the lowest of the two values

      • maxdecoders — Number of language model decoding processes to start

      • ncpus — Number of CPU cores
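The numgpus formula above can be worked through in a few lines of Python. This is a sketch of the arithmetic only; the function name default_numgpus is hypothetical, and the input values in the usage note are made-up examples rather than recommended settings.

```python
import math


def default_numgpus(maxdecoders: int, ncpus: int) -> int:
    """Compute roundup(min(maxdecoders, ncpus) / 4) as described above."""
    # min() selects the lower of the two values; math.ceil() rounds a
    # fractional result up to the next whole number.
    return math.ceil(min(maxdecoders, ncpus) / 4)
```

For example, with maxdecoders=10 and ncpus=16 the lower value is 10, and 10 / 4 = 2.5 rounds up to 3. With maxdecoders=32 and ncpus=8 the lower value is 8, which gives exactly 2.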

Sample models.list

The following is a sample models.list file:

#config memfree=5 expire=12 all=demand
eng1:callcenter load
eng1:* demand

Important

The #config statement is not a comment. It must be present to configure overall language model loading behavior.

The first line represents a global configuration command with the following settings:

  • memfree=5 — ensures that there is at least 5 GB of free memory on the system.

  • expire=12 — automatically unload any language model if it has not been used in 12 hours.

  • all=demand — never load a model unless requested.

The second line instructs vociserver to automatically load eng1:callcenter into memory when the vociserver service starts up. Model line syntax is: lang:model_name option. The default option is load. If no model lines are provided, all models available on the system will be loaded (or made available on-demand with all=demand).

The third line uses wildcard matching to make all remaining eng1 language models available on demand. Wildcard matching (*) is supported in lang:model_name. The first matching line takes priority, so ordering matters, especially with wildcards: eng1:callcenter matches the second line first and therefore loads at startup rather than on demand.
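The first-match rule can be illustrated with a short Python sketch. It is not vociserver's matching code: the function name resolve_action is hypothetical, the model name eng1:voicemail in the usage note is a made-up example, and fnmatch also accepts ? and [] patterns, which the documentation above does not mention.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple


def resolve_action(
    model: str,
    rules: List[Tuple[str, str]],
    default: str = "load",
) -> str:
    """Return the action of the first rule whose pattern matches model.

    rules are (pattern, action) pairs in file order; default stands in for
    the global all= setting when no line matches.
    """
    for pattern, action in rules:
        if fnmatchcase(model, pattern):
            return action  # first match wins; later lines are ignored
    return default


# Rules in the same order as the sample models.list above.
SAMPLE_RULES = [("eng1:callcenter", "load"), ("eng1:*", "demand")]
```

With SAMPLE_RULES, eng1:callcenter resolves to load, while a hypothetical eng1:voicemail resolves to demand. If the two lines were swapped, eng1:* would match first and eng1:callcenter would also resolve to demand, which is why ordering matters.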

Complex models.list Sample

The following sample is a more complex models.list file:

#config memfree=5 expire=12 cycle=1 numgpus=2
eng1:callcenter load numgpus=2
eng2:callcenter demand numgpus=1 expire=2

The configuration in the sample above results in the following:

  • Leave 5 GB of system memory free.

  • Unload models after 12 hours idle time.

  • Reload language models every hour.

  • Load other acoustic models with 2 GPU processes.

  • Load eng1:callcenter at startup with 2 GPU processes for acoustic models.

  • Load eng2:callcenter when requested with 1 GPU process for acoustic models and unload after 2 hours of idle time.
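To make the structure of the file concrete, the following Python sketch reads a models.list file into a global settings dictionary and a list of per-model entries. It is a rough illustration based only on the syntax shown in this section, not the parser that vociserver uses: the function name parse_models_list and the output layout are hypothetical, values are kept as strings, and lines beginning with # other than #config are not given any special handling because their treatment is not described here.

```python
from typing import Dict, List, Tuple

ACTIONS = {"load", "demand", "no"}


def parse_models_list(text: str) -> Tuple[Dict[str, str], List[Dict[str, str]]]:
    """Split models.list text into (global_config, model_entries)."""
    config: Dict[str, str] = {}
    models: List[Dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#config"):
            # Global settings: key=value tokens after the #config keyword.
            for token in line.split()[1:]:
                key, _, value = token.partition("=")
                config[key] = value
            continue
        name, *options = line.split()
        entry = {"model": name, "action": "load"}  # load is the default
        for option in options:
            if option in ACTIONS:
                entry["action"] = option
            else:
                key, _, value = option.partition("=")
                entry[key] = value
        models.append(entry)
    return config, models
```

Applied to the complex sample above, the sketch yields the global settings memfree=5, expire=12, cycle=1, and numgpus=2, followed by one entry for eng1:callcenter (load, numgpus=2) and one for eng2:callcenter (demand, numgpus=1, expire=2).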