Processing and performance
Voci's ASR performance, also known as throughput, is measured as the total duration of audio transcribed per hour. The specifications of the system that V‑Blaze runs on have a significant impact on throughput. For example, V‑Blaze achieves higher throughput on a system with a 16-core CPU, an NVIDIA GPU module, and 128 GB of RAM than on a system with an 8-core CPU and no GPU.
Throughput varies based on language models, optional ASR features, and audio characteristics. Concurrent use of multiple models and languages on a single instance is supported, but the installed model configuration must fit within the available resources (primarily RAM and GPU memory).
V‑Blaze runs on a wide variety of configurations. The primary factors for choosing a deployment configuration are operational cost and availability. Calculate operational cost by dividing the system cost (price per hour) by the observed throughput (hours of audio transcribed per hour); the result is the cost per hour of audio transcribed.
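The cost calculation above can be sketched in a few lines of Python. This is only an illustration: the function name and the hourly prices are hypothetical placeholders, not Voci or AWS figures, and the throughput values are the nominal 200 and 50 hours listed for the AWS configurations in this document. Substitute the prices you actually pay and the throughput you actually observe.

```python
def cost_per_audio_hour(price_per_hour: float, audio_hours_per_hour: float) -> float:
    """Return the operational cost of transcribing one hour of audio.

    price_per_hour: what the system costs to run for one wall-clock hour.
    audio_hours_per_hour: observed throughput (hours of audio per wall-clock hour).
    """
    if audio_hours_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return price_per_hour / audio_hours_per_hour


# Hypothetical hourly prices, chosen only to show the arithmetic.
gpu_cost = cost_per_audio_hour(price_per_hour=1.20, audio_hours_per_hour=200)
cpu_cost = cost_per_audio_hour(price_per_hour=0.40, audio_hours_per_hour=50)

print(f"GPU host: ${gpu_cost:.4f} per audio hour")
print(f"CPU host: ${cpu_cost:.4f} per audio hour")
```

With these placeholder prices the GPU host is cheaper per hour of audio even though it costs more per wall-clock hour, which is why the comparison should be made on cost per audio hour rather than on instance price alone.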
For more information on processing and performance, refer to V‑Blaze standard system configurations or contact support@vocitec.com to determine the most suitable options for your needs.
Storage
V‑Blaze does not store any request data, so system storage requirements are minimal and limited to the software and models. The V‑Blaze software requires less than 10 GB of storage, and each ASR model can require up to 5 GB. Typical system configurations range from 20 GB to 250 GB. SSD storage is recommended for quicker boot and load times.
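As a rough sizing aid, the figures above can be combined into an upper-bound estimate. The sketch below is an assumption-laden illustration, not a Voci tool: the function name and the `headroom_gb` parameter are hypothetical, and the constants are the upper bounds quoted in this section (10 GB for software, 5 GB per model).

```python
SOFTWARE_GB = 10   # upper bound for the V-Blaze software, per this section
MAX_MODEL_GB = 5   # upper bound for a single ASR model, per this section


def estimate_storage_gb(num_models: int, headroom_gb: int = 0) -> int:
    """Return an upper-bound storage estimate in GB.

    num_models: number of ASR models installed on the host.
    headroom_gb: optional extra space (OS, logs, growth); a hypothetical knob.
    """
    if num_models < 0 or headroom_gb < 0:
        raise ValueError("num_models and headroom_gb must be non-negative")
    return SOFTWARE_GB + MAX_MODEL_GB * num_models + headroom_gb


print(estimate_storage_gb(2))               # two models, no headroom
print(estimate_storage_gb(4, headroom_gb=10))
```

Two models with no headroom come to 20 GB, which matches the low end of the typical range quoted above.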
Using V‑Blaze with AWS
Voci support recommends using GPU-enhanced instances to deploy V‑Blaze on AWS. The g4dn.4xlarge instance type provides the best balance of price, performance, and scalability for V‑Blaze; g3.4xlarge is also supported.
For non-GPU instances, optimal sizing depends on other factors like the number of language models installed. Contact support@vocitec.com for recommendations.
V‑Blaze standard system configurations
Configuration | Nominal Throughput (audio processed per hour) | System Type | AWS Instance Type | Optional Configuration | Notes |
---|---|---|---|---|---|
V‑Blaze - AWS standard | 200 hrs | Virtual (GPU) | g3.4xlarge, g4dn.4xlarge, g5.4xlarge | | GPU hosts can process approximately 200 hours of audio per hour. |
V‑Blaze - AWS minimal development | 50 hrs | Virtual (no GPU) | m4.2xlarge, m5.2xlarge, m6.2xlarge, m7.2xlarge | | Non-GPU hosts can process approximately 50 hours of audio per hour. This is a cost-effective host for development with customer data. |

Configuration | Nominal Throughput (audio processed per hour) | System Type | Hardware Specifications | Optional Configuration | Notes |
---|---|---|---|---|---|
V‑Blaze - High Volume | 1000+ hrs | 1U Server | CPU: 2x Intel Xeon Gold 6248R; GPU: 1x NVIDIA A100 (40 GB); RAM: 384 GB; Storage: 2x 250 GB SSD in RAID 1 | 2x 2 TB HDD | Additional storage is necessary if a single ASR host is used for Direct-to-Transcript, V‑Spark Analytics, or audio storage. |
V‑Blaze - Low Volume | 350 hrs | 1U Server | CPU: 2x Intel Xeon Gold 6226R; GPU: 1x NVIDIA T4 (16 GB); RAM: 192 GB; Storage: 2x 250 GB SSD in RAID 1 | 2x 2 TB HDD | Additional storage is necessary if a single ASR host is used for Direct-to-Transcript, V‑Spark Analytics, or audio storage. |
V‑Blaze - AWS Standard | 200 hrs | EC2 g4dn.4xlarge | CPU: 16x vCPU (Intel Xeon Platinum 8259CL, Cascade Lake); GPU: 1x NVIDIA T4 (16 GB); RAM: 64 GB; Storage: 20 GB EBS gp2 | | Alternate AWS instances include g3.4xlarge or any larger g3/g4 instance. Voci recommends .4xl instances for the best value and scaling. |
V‑Blaze - Minimal Virtual (no GPU) | 50 hrs | EC2 m5.2xlarge or equivalent VM instance | CPU: 8x vCPU (Intel Xeon Platinum 8259CL); GPU: N/A; RAM: 32 GB; Storage: 20 GB EBS gp2 | | Voci does not recommend this configuration for production. However, a nominal throughput of up to 50 hours of audio per hour is possible. |
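
The nominal throughput figures in the tables above can be used for a first-pass estimate of how many hosts a workload needs. The following is a minimal sketch under stated assumptions: the function and dictionary names are hypothetical, "1000+ hrs" is treated as a lower bound of 1000, and nominal throughput is assumed to hold for your models and audio, which this document notes is not guaranteed. Treat the result as a starting point for a conversation with Voci support, not as a sizing guarantee.

```python
import math

# Nominal throughput (hours of audio per wall-clock hour), taken from the tables.
NOMINAL_THROUGHPUT = {
    "high_volume": 1000,      # listed as "1000+ hrs"; lower bound used here
    "low_volume": 350,
    "aws_standard": 200,
    "minimal_virtual": 50,
}


def hosts_needed(audio_hours: float, window_hours: float, config: str) -> int:
    """Estimate how many hosts of one configuration are needed to
    transcribe `audio_hours` of audio within `window_hours` of wall-clock time."""
    if audio_hours < 0 or window_hours <= 0:
        raise ValueError("audio_hours must be >= 0 and window_hours must be > 0")
    required_rate = audio_hours / window_hours  # audio hours per wall-clock hour
    return max(1, math.ceil(required_rate / NOMINAL_THROUGHPUT[config]))


# Example workload: 10,000 hours of audio to be processed within 24 hours.
for name in NOMINAL_THROUGHPUT:
    print(name, hosts_needed(10_000, 24, name))
```

For this example workload the required rate is roughly 417 hours of audio per hour, so the estimate is one High Volume host, two Low Volume hosts, three AWS Standard hosts, or nine Minimal Virtual hosts.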