Language Model Configuration
The models.list file is located in /opt/voci/. It contains options to configure language model sizing and availability parameters. A common use case for configuring models.list is to specify which models to load at startup and which models to load when requested (dynamic loading). Dynamic loading is available for language models only; all acoustic models load at startup.
General Behavior
V‑Blaze automatically attempts to optimize GPU memory usage on system startup by loading all the necessary models. If the system runs out of memory, it will restart in safe mode, end all acoustic model processes, and set numgpus to 1. One of each model loads to measure the memory requirements and optimize loading behavior accordingly. The system will restart when it is done calculating optimal memory sizing for the models. The following entries are logged to /var/log/vociserver/err.log during the memory sizing process:
Decode server failed to start. Restarting in safe mode.
Decode server successfully started in safe mode. Restarting.
Note
V‑Blaze attempts to adhere to the settings specified in models.list; however, reductions may occur until all models are able to load.
models.list
If no models.list configuration is provided, the system will attempt to load all available language models and their corresponding acoustic models. If licensing restricts certain installed models, a subset of licensed models will load instead. The following models.list configuration limits the models loaded at startup to eng1:callcenter and spa1:callcenter (if available and licensed):
eng1:callcenter
spa1:callcenter
Note
Models can expire even if they are loaded at startup by default.
If an installed model is excluded, the associated acoustic model won't load.
Language #config parameter syntax (optional):
memfree=N: Defines the minimum amount of system memory (in GB) to keep available. The ASR engine will attempt to unload the least used language models until the memfree specification is met.
expire=H: Hours of idle time after which a model should be automatically unloaded (default is to never unload because of idle time).
all=no|demand|load: The value defined in all applies to all available language models. The following list explains each value:
    load: Load the specified language model at startup (default).
    demand: Load the specified language model if requested by a stream.
    no: Never load the specified language model. If no model lines are provided, all models available on the system will be loaded (or made available on-demand with all=demand).
General model parameter syntax: model [load|demand|no]
    load: Load the specified language model at startup (default).
    demand: Load the specified language model if requested by a stream.
    no: Never load the specified language model. If no model lines are provided, all models available on the system will be loaded (or made available on-demand with all=demand).
cycle=N: The value defined in cycle (number of hours) specifies when to reload language models to optimize memory use.
numgpus=N: Defines the number of GPU processes for associated acoustic models. Acoustic processes are shared among all language models that use the same acoustic model. It is possible to use CPU processes instead of GPU processes by prefixing the value of numgpus with c to indicate the number of CPU processes. numgpus is calculated using the following formula:
    roundup(min(maxdecoders, ncpus) / 4)
    roundup(): Round up if the final value is fractional
    min(): Select the lowest of the two values
    maxdecoders: Number of language model decoding processes to start
    ncpus: Number of CPU cores
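The formula above can be checked with a few lines of Python. This is only an illustrative sketch: the function name default_numgpus is hypothetical and not part of V‑Blaze, and the input values are made up for the example.

```python
import math


def default_numgpus(maxdecoders: int, ncpus: int) -> int:
    """Evaluate roundup(min(maxdecoders, ncpus) / 4) from the text above.

    default_numgpus is a hypothetical helper name used for illustration only.
    """
    if maxdecoders < 1 or ncpus < 1:
        raise ValueError("maxdecoders and ncpus must be positive")
    # min() picks the smaller value; math.ceil() rounds a fractional result up.
    return math.ceil(min(maxdecoders, ncpus) / 4)


# min(10, 16) = 10, and 10 / 4 = 2.5, which rounds up to 3.
print(default_numgpus(maxdecoders=10, ncpus=16))
# min(32, 8) = 8, and 8 / 4 = 2.0, which needs no rounding.
print(default_numgpus(maxdecoders=32, ncpus=8))
```

Because the smaller of the two inputs drives the result, adding CPU cores beyond maxdecoders (or decoders beyond ncpus) does not change the computed value.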
Sample models.list
The following is a sample models.list file:
#config memfree=5 expire=12 all=demand
eng1:callcenter load
eng1:* demand
Important
The #config statement is not a comment. It must be present to configure overall language model loading behavior.
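The first line represents a global configuration command with the following settings:
    memfree=5: Keep at least 5 GB of system memory free.
    expire=12: Unload models after 12 hours of idle time.
    all=demand: Make all available language models loadable on demand.
The second line instructs the system to load eng1:callcenter at startup.
The third line uses wildcard matching to instruct the system to load models matching eng1:* when requested.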
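To make the line structure of the sample concrete, the following Python sketch splits a models.list text into its #config settings and its per-model lines. It is not V‑Blaze's parser: the function name parse_models_list and the returned dictionary layout are hypothetical, and skipping other lines that begin with # is an assumption. The sketch deliberately does not decide which line wins when several patterns match the same model, because the text above does not specify that.

```python
VALID_MODES = {"load", "demand", "no"}


def parse_models_list(text):
    """Split models.list text into (#config settings, per-model rules).

    Hypothetical helper for illustration; not the parser V-Blaze uses.
    """
    config = {}
    rules = []
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] == "#config":
            # "#config" is a directive, not a comment: keep its key=value pairs.
            for item in fields[1:]:
                key, _, value = item.partition("=")
                config[key] = value
            continue
        if fields[0].startswith("#"):
            continue  # assumption: any other "#" line is an ordinary comment
        pattern, mode, params = fields[0], "load", {}  # load is the documented default
        for item in fields[1:]:
            if "=" in item:
                key, _, value = item.partition("=")
                params[key] = value  # per-model parameter such as numgpus=1
            elif item in VALID_MODES:
                mode = item
            else:
                raise ValueError(f"unexpected token {item!r} in line {raw!r}")
        rules.append({"pattern": pattern, "mode": mode, "params": params})
    return config, rules


sample = """\
#config memfree=5 expire=12 all=demand
eng1:callcenter load
eng1:* demand
"""
config, rules = parse_models_list(sample)
print(config)
for rule in rules:
    print(rule)
```

Values are kept as strings so that forms such as numgpus=c2 survive unchanged; converting them to numbers is left to whatever consumes the parsed result.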
Complex models.list Sample
The following sample is a more complex models.list file:
#config memfree=5 expire=12 cycle=1 numgpus=2
eng1:callcenter load numgpus=2
eng2:callcenter demand numgpus=1 expire=2
eng3:callcenter demand numgpus=c2
The configuration in the sample above results in the following:
Leave 5 GB of system memory free.
Unload models after 12 hours idle time.
Reload language models every hour.
Load other acoustic models with 2 GPU processes.
Load eng1:callcenter at startup with 2 GPU processes for acoustic models.
Load eng2:callcenter when requested with 1 GPU process for acoustic models and unload after 2 hours of idle time.
Load eng3:callcenter when requested with 2 CPU processes for acoustic models.
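The results listed above suggest that per-model parameters override the #config defaults and that a c prefix on numgpus selects CPU processes. The Python sketch below reproduces that reading for the three models in the sample. The helper names parse_numgpus and effective_settings are hypothetical, and treating per-model values as simple overrides is an assumption inferred from the listed results, not a description of V‑Blaze internals.

```python
def parse_numgpus(value):
    """Interpret a numgpus value: 'c2' means 2 CPU processes, '2' means 2 GPU processes."""
    if value.startswith("c"):
        return "cpu", int(value[1:])
    return "gpu", int(value)


def effective_settings(defaults, overrides):
    """Overlay per-model parameters on the #config defaults (assumed behavior)."""
    merged = {**defaults, **overrides}  # per-model values replace the defaults
    device, count = parse_numgpus(merged["numgpus"])
    return {
        "expire_hours": int(merged["expire"]),
        "device": device,
        "processes": count,
    }


# Defaults taken from the #config line of the sample.
defaults = {"expire": "12", "numgpus": "2"}
# Per-model parameters taken from the three model lines of the sample.
models = {
    "eng1:callcenter": {"numgpus": "2"},
    "eng2:callcenter": {"numgpus": "1", "expire": "2"},
    "eng3:callcenter": {"numgpus": "c2"},
}
for name, overrides in models.items():
    print(name, effective_settings(defaults, overrides))
```

Under this reading, eng2:callcenter is the only model whose idle timeout differs from the global 12 hours, and eng3:callcenter is the only one served by CPU processes.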