V-Spark Online Help

V‑Spark 4.0

V‑Spark 4.0 is a major release with significantly changed architecture, along with numerous improvements and bug fixes.

  1. V‑Spark now runs on CentOS 7. Changes to V‑Spark’s technical foundation are primarily related to the CentOS 7 upgrade and are the main drivers for this release.

  2. V‑Spark services are now managed by systemd. SysVinit has been phased out and replaced with systemd as part of the upgrade to CentOS 7.

    1. This update removes init.d and intermediate scripts where possible, and introduces a hierarchy of individually managed services defined in unit files. The new service hierarchy can be represented as follows:

      [Figure: V‑Spark service hierarchy diagram (vspark_services_diagram-small.png)]

    2. V‑Spark services can be controlled as a group by using the top-level vspark.service, which acts as a wrapper around the others. To start this service, use the following command:

      $ sudo systemctl start vspark
    3. The services started by vspark.service are specified in the Wants directive in the /usr/lib/systemd/system/vspark.service file. If vspark.service is enabled, it starts automatically on boot and reboot, and any services listed in the Wants directive also try to start. Services excluded from the Wants directive will not start.

    4. Each service can be started, stopped, and restarted separately with systemctl, without invoking vspark.service, as in these examples:

      $ sudo systemctl start vspark-back
      $ sudo systemctl start vspark-front 
      $ sudo systemctl start vspark-jobmgr

      Other options for systemctl vspark include the following:

      $ sudo systemctl restart vspark 
      $ sudo systemctl stop vspark 
      $ sudo systemctl status vspark
  3. V‑Spark now has a dedicated command line script for system administration. To use it, run sudo vspark-admin OPTION with any of the parameters in the following table.

    Table 1. Option parameters for vspark-admin

    Parameter       Description
    --------------  ----------------------------------------------------------
    status          Shows the status of all V‑Spark services.

    status OPTION   Shows the status of a specific service. To specify that
                    service, replace OPTION with one of these parameters:
                    front, back, jobmgr, sccluster

    version         Displays the currently installed version of V‑Spark.

    core-check      Shows whether service dependencies are within the allowed
                    version range.

    core-update     Shows and applies available updates for V‑Spark
                    dependencies.

                    To preview required changes to core schemas and data
                    structures, invoke core-update as in this example:

                    $ sudo vspark-admin core-update

                    To apply required changes to core schemas and data
                    structures, invoke core-update with the commit parameter
                    -c as in this example:

                    $ sudo vspark-admin core-update -c

    check-health    Shows status information about service dependencies.

    show-config     Displays a list of every configuration setting
                    (represented as key-value pairs) that V‑Spark is using.

    Note

    Any configuration changes made to a running version of V‑Spark must be reloaded for changes to take effect.



    For example, to check the status of V‑Spark's front-end services, run the following command:

    $ sudo vspark-admin status front
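
The status commands in Table 1 can also be generated from a script. The following Python sketch is not part of V‑Spark; it only builds the argument list for each documented service, and the helper name build_status_command is hypothetical. Actually running the commands (for example with subprocess.run) requires a host with V‑Spark installed.

```python
# Hypothetical helper; only the command and service names come from Table 1.
SERVICES = ("front", "back", "jobmgr", "sccluster")


def build_status_command(service=None):
    """Return the argument list for `sudo vspark-admin status [SERVICE]`."""
    command = ["sudo", "vspark-admin", "status"]
    if service is not None:
        # Reject anything that is not one of the documented service names.
        if service not in SERVICES:
            raise ValueError(f"unknown service: {service!r}")
        command.append(service)
    return command
```

Calling build_status_command("front") yields the same command as the example above, and calling it with no argument yields the command that reports on all services.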
  4. V‑Spark now attempts to reconnect to service dependencies automatically. When certain required services become unavailable, V‑Spark will attempt to reconnect to them in order to minimize service disruption. By default, there is no limit to the number of times V‑Spark will attempt to reconnect.
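
The reconnect behavior described above can be pictured as a retry loop with no attempt limit. This is a minimal Python sketch of that general pattern, not V‑Spark's implementation; the function name, the fixed delay between attempts, and the optional max_attempts parameter are all assumptions made for illustration.

```python
import time


def connect_with_retry(connect, delay_seconds=5.0, max_attempts=None, sleep=time.sleep):
    """Call `connect` until it succeeds and return its result.

    max_attempts=None retries without limit, mirroring the default described
    above. The fixed delay between attempts is an assumption of this sketch.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return connect()
        except ConnectionError:
            # Give up only if a limit was set and has been reached.
            if max_attempts is not None and attempt >= max_attempts:
                raise
            sleep(delay_seconds)
```

The sleep parameter is injectable so the loop can be exercised in tests without real waiting.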

  5. V‑Spark now uses improved logic to handle connectivity issues with the license server. When revalidation fails due to loss of network connectivity or other scenarios, V‑Spark will respect the time to live (TTL) value associated with the requested license. As long as the TTL with the license server hasn't expired, the license will stay validated in order to minimize service interruptions.
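
The TTL rule above amounts to a simple time comparison. The following Python sketch illustrates it under stated assumptions: the function and parameter names are hypothetical, and treating the license as expired exactly at the TTL boundary is a choice made here, not documented behavior.

```python
from datetime import datetime, timedelta


def license_still_valid(last_validated, ttl, now, revalidation_ok):
    """Return True if the license should be treated as validated at `now`."""
    if revalidation_ok:
        return True
    # Revalidation failed: rely on the TTL granted at the last success.
    return now < last_validated + ttl
```

For example, with a one-hour TTL, a failed revalidation 30 minutes after the last success leaves the license validated, while a failure two hours after it does not.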

  6. V‑Spark now includes the voci-spark-tools package, which contains two utilities—Datatool and Config Manager—that facilitate the transfer of installation data and configuration settings.

    1. Datatool is a command-line tool for importing (load) and exporting (backup) audio and transcription data. For more details, see the Datatool.MD file in the utils/datatool directory.

    2. Config Manager is a script for importing and exporting installation configuration settings. This script can be found in the utils/ directory.

  7. Improved V‑Spark security to help protect against remote code execution. For more secure file handling, name validation rules apply to audio files and ZIP archives uploaded for transcription via the GUI and API, and also to files uploaded to the Audio Evaluator. The names of files inside a ZIP archive are not checked. This feature was implemented in release version 4.0.1-3. The default naming rules forbid these characters: #*<>:?/\|{}$!'`"=^

    Filename validation is enabled by default. To disable it, set the filename_validation system configuration setting to off. To define custom character requirements, specify a regular expression via filename_validation_pattern.
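
To make the default rule concrete, the following Python sketch checks a name against the forbidden characters listed above. It illustrates the rule only and is not V‑Spark's validation code; the function name is hypothetical, rejecting empty names is an assumption, and custom patterns set via filename_validation_pattern are not modeled.

```python
# Default forbidden characters, copied from the list above.
FORBIDDEN_CHARS = frozenset('#*<>:?/\\|{}$!\'`"=^')


def is_valid_upload_name(filename):
    """Return True if `filename` is non-empty and has no forbidden character."""
    return bool(filename) and not (set(filename) & FORBIDDEN_CHARS)
```

Under this rule a name such as call_2021-05-01.wav passes, while a name that contains $, a backtick, or a slash is rejected.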

  8. Improved V‑Spark security to help protect against SQL injections.

  9. V‑Spark can now log which users have viewed the File Details page. To minimize excessive or unnecessary logging noise, this setting is off by default. To enable it, set the new system configuration option audit_filedetails_pageviews to on. When enabled, an audit entry is logged in the Activity Log and recorded in server.log as an INFO entry.

  10. Links to HTML documentation in V‑Spark's help menu now point to docs.vocitec.com/V-Spark/. This update deprecates the release_note configuration option because the HTML version of the release notes has moved from the installation to the website. Note that the release_note_dl configuration still exists because PDF versions of documentation are still bundled with the software.

Announcements with V‑Spark 4.0

  1. System architecture has significantly changed with the V‑Spark 4.0 release. System administrators should take note of the new front-end libraries and other dependencies listed in the following table as they evaluate installation requirements.

    Table 1. Changes in V‑Spark dependencies from version 3.5 to 4.0

    Dependency          Version used with V‑Spark 3.5   Version used with V‑Spark 4.0
    ------------------  ------------------------------  ---------------------------------
    CentOS              6                               7
    Node                6                               14
    Elasticsearch       5.6                             7.6
    MySQL and MariaDB   MySQL 5.1                       MariaDB 5.5 (EPEL repository)
    Redis               3.2                             3.2 (no changes, EPEL repository)
    Moment.js           2.17.1                          2.29.1
    Bootstrap           3.2.0                           3.4.1
    jQuery              2.2.4                           3.6.0



    Note

    Customers upgrading from a 3.4.3 or 3.5 V‑Spark system running on CentOS 6 should contact customer support for recommended upgrade paths.

    Preliminary testing shows that enhancements in the 4.0 release lead to a 10% average performance improvement during data ingestion over previous versions.

  2. The location of some V‑Spark log files has changed. Some logged information now goes into CentOS 7's journald and can be viewed using journalctl from the command line, as in the following example:

    $ sudo journalctl --unit=vspark-jobmgr

    This change keeps V‑Spark log files consistent with systemd service handling best practices. V‑Spark 4.0 log locations are listed in the following tables:

    Table 2. /var/log/vspark/

    Original Location                    New Location
    -----------------------------------  ------------
    /var/log/vspark/front-err.log        journal
    /var/log/vspark/back-err.log         journal
    /var/log/vspark/backend_stdio.log    No change
    /var/log/vspark/backendWorker.log    No change
    /var/log/vspark/license.log          No change
    /var/log/vspark/search.log           No change
    /var/log/vspark/server.log           No change



    Table 3. /var/log/vocijobmgr/

    Original Location               New Location
    ------------------------------  ------------
    /var/log/vocijobmgr/init.err    journal
    /var/log/vocijobmgr/init.out    journal



    Table 4. /var/lib/vspark/managers/{company-org-folder}/logs/

    Original Location    New Location
    -------------------  ------------
    *                    No change



  3. The voci-spark-hdfs-lib package is now optional, and it is no longer installed by default, as most installations don't require HDFS support.

Fixes in V‑Spark 4.0

The following issues have been resolved in the V‑Spark 4.0 release.

V‑Spark 4.0.2 Fixes

  1. Action buttons will no longer appear enabled until the required form criteria are met. Previously, Login and Upload buttons would appear to be active even though using them would cause errors.

  2. Resolved an issue with error and warning logging that caused some details to be discarded, or to be spread out across multiple lines. Log messages for most system errors and warnings now include more complete and consistent detail.

V‑Spark 4.0.1 Fixes

  1. Folder permissions were sometimes incorrectly set when using the API to create folders. This caused the API call to fail sporadically, and caused the folders to be hidden and unusable until the permissions were correctly set.

  2. Dashboard performance when viewing individual folders has been improved. Note that only single-folder dashboard performance was affected by this issue.

  3. V‑Spark services now automatically restart during hardware boot and reboot.

  4. Backend V‑Spark services now properly stop when Redis is down. Prior to this change, stopping the backend service while Redis was down could lead to errors that required killing backend processes manually in order to restart them.

  5. The folder server status indicator now consistently shows red or green regardless of user role.

  6. The call volume data display on the Overview Dashboard now behaves more consistently. Previously, page refreshes would show either the monthly view or the 31-day view at random. The dashboard now keeps showing whichever view (monthly or 31-day) was last selected across subsequent page loads.

  7. Organizations created via the /config endpoint will now use the "US/Eastern" time zone if none is specified in the initial API call. Previously, no time zone would be assigned to the organization. This would eventually cause the UI to fail to render some pages, as certain pages expect time zone information to exist for organizations.

Maintenance Updates Since the Initial 4.0 Release

This section describes the bug fixes and other enhancements included in the latest version of the 4.0 release. Maintenance versions and associated changes are grouped by release version and listed most recent to earliest.

Important

Any updates to V‑Spark 4.x from versions earlier than 4.0.1-8 require running the core-update script before starting the upgraded installation. See the core-update row in release note 3 for instructions for running this script.

V‑Spark 4.0.2-1 Updates

  1. System architecture has significantly changed with the V‑Spark 4.0.2 release. Front-end library updates are the primary driver for the 4.0.2-1 release. System administrators should take note of the new libraries listed in the following table:

    Table 1. Changes in V‑Spark front-end libraries from version 4.0.1 to 4.0.2

    Dependency   Version used with V‑Spark 4.0.1   Version used with V‑Spark 4.0.2
    -----------  --------------------------------  --------------------------------
    Node         10.19.0                           14.15.0
    Moment.js    2.17.1                            2.29.1
    Bootstrap    3.2.0                             3.4.1
    jQuery       2.2.4                             3.6.0



  2. Overhauled the layout of the System Status page to emphasize the most significant points of data flow in the system. The following changes have been made:

    • System information has been divided into the System Status and Processing & Performance tabs.

    • The System Status tab displays host hardware and installation attributes, along with uptimes, status information, and usage for Elasticsearch, Redis, SQL, and local storage.

    • The Processing & Performance tab displays queue status and folder activity metrics. This information includes the following:

      • ASR and analytics queue and processing status for audio calls

      • Time to ASR and transcript analysis completion (in seconds) over the last 15 minutes, the last hour, the last 4 hours, the last day, and the last week

      • Two lists of folders: one that displays folders by most calls submitted to the system, and another that displays folders by most calls in the ASR queue

      • A sortable table of folder activity for all system folders that displays the number of files ingested, the duration of each folder's audio (in seconds), the number of files completed, and the last active date. The table can be filtered by various time periods.

  3. Improved response handling for systems using V‑Cloud. Eliminated extraneous 503 errors generated when communicating with V‑Cloud to minimize unnecessary noise in Job Manager log files.

  4. Updated the system password policy with more rigorous requirements. The following policy changes have been made:

    • A user must provide the current password to change the account's email address.

    • A user may not change the account password to any of the previous four used.

    • Passwords must contain both letters and numbers.

  5. V‑Spark now supports the secure attribute for sensitive cookies. To use cookies with the secure attribute, set the new system configuration option secure_cookies to on.

    Warning

    Enabling secure_cookies on a non-HTTPS system will make the system unusable.
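
The reason for this warning is that browsers send cookies marked Secure only over HTTPS, so on a plain HTTP system such cookies never come back to the server. The following Python sketch uses the standard library to show what a cookie header with the attribute looks like; the cookie name and value are hypothetical and unrelated to V‑Spark's actual cookies.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "example-value"  # hypothetical cookie name and value
cookie["session"]["secure"] = True   # adds the Secure attribute
header = cookie.output()             # a "Set-Cookie: ..." header line
```

The resulting header line carries the Secure attribute after the cookie value.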

  6. Applied additional security updates, including the following:

    • Replaced feedback message on the Forgot Password page to obfuscate whether the provided email address exists in the system.

    • Addressed all cross-site scripting vulnerabilities in the system discovered up to this point.

V‑Spark 4.0.1-10 Updates

  1. Added new configuration file options to support encrypted TLS/SSL database connections. The following options have been added:

    • db_ssl

    • db_ssl_ca

    • db_ssl_cert

    • db_ssl_key

    • db_ssl_enable_azure_connections

  2. Improved logic for handling Job Manager processes exhibiting continuous memory growth. V‑Spark now waits a maximum of 10 minutes for in-progress jobs to finish before terminating and replacing a worker process whose memory usage exceeds the configured threshold.
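
The replacement decision described above can be summarized as a small function. This Python sketch is an interpretation of the text, not V‑Spark's implementation: the function and parameter names are hypothetical, and it assumes the worker is replaced immediately once no jobs remain in progress. The threshold corresponds to the jm_memory_threshold option (in bytes) introduced in the 4.0.1-5 updates.

```python
MAX_WAIT_SECONDS = 10 * 60  # the 10-minute limit described above


def should_replace_worker(memory_bytes, threshold_bytes, jobs_in_progress,
                          seconds_over_threshold):
    """Decide whether a worker process should be terminated and replaced."""
    if memory_bytes <= threshold_bytes:
        return False  # within the threshold: keep the worker running
    if jobs_in_progress == 0:
        return True   # nothing left to finish: replace right away
    # Jobs are still running: wait for them, but no longer than the limit.
    return seconds_over_threshold >= MAX_WAIT_SECONDS
```

A worker over the threshold with two jobs still running is kept for up to 600 seconds and is replaced once that limit is reached.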

V‑Spark 4.0.1-9 Updates

  1. Added folder activity metrics to the System Status page. Folder activity metrics may be limited to certain time ranges using the time selector. The following folder metric fields have been added:

    Ingestion Volume

    Displays the total number of audio and metadata files added to Elasticsearch and assigned a transcriptID in each listed folder. Files deleted after ingestion are not represented in this count.

    Throughput

    Displays the total audio time, measured in seconds, of all files in each listed folder.

    Analysis Completion

    Displays the total number of files analyzed as counted by transcriptID in each listed folder.

    The System Status page is in active development, and additional metrics will become available in subsequent releases. All available metrics are documented in the System and Performance Monitoring section of the V‑Spark Management Guide.

  2. Applied additional security updates to enhance application resiliency.

  3. Fixed an issue that caused Chrome to auto-fill the folder page search box with the current username.

V‑Spark 4.0.1-8 Updates

  1. Added system summary and performance metrics sections to the System Status page.

    Overall Metrics shows the host system's CPU specifications, memory and storage usage, and information about front- and back-end services.

    Total Jobs shows the number of jobs the system is currently processing.

    Time to Analysis Completion shows the minimum, average, and maximum analysis completion times over various increments.

  2. Changed a default Elasticsearch setting to allow for greater application complexity. The issue addressed by this change can be identified by an Elasticsearch error in search.log containing a single numeric value without any additional context. The incomplete error message was caused by faulty truncation of the full Elasticsearch error message, which could make the root cause difficult to diagnose.

V‑Spark 4.0.1-7 Updates

  1. Applied additional security updates to enhance application resiliency.

  2. Enabled a new System Status page for system administrator user accounts, accessible using the System option in the Settings menu. The first iteration of the status page shows

    • the names of active and queued folders,

    • the number of active and queued transcript analysis requests, and

    • the number of active and queued callbacks.

V‑Spark 4.0.1-6 Updates

  1. Applied multiple security updates to enhance application resiliency.

  2. Improved error logging in server.log to include SMTP configuration errors when applicable.

  3. Defined unzip as a package dependency so that it no longer needs to be installed manually on a host system.

V‑Spark 4.0.1-5 Updates

  1. Addressed a continuous memory growth issue affecting Job Manager processes. A new V‑Spark option (jm_memory_threshold) has been added that specifies the memory threshold in bytes for Job Manager folder workers.

    When the worker process exceeds the memory threshold, it is internally paused until the current jobs finish. It then terminates and is replaced by a new Job Manager folder worker process.

  2. Fixed an issue that caused file scoring to fail without retry if Elasticsearch failed to create certain temporary indices.

  3. Fixed a file permission issue with the vspark-admin script's core-check option. The script now reports the correct filesystem version when executing after a clean V‑Spark install.

  4. Fixed V‑Spark's integration with ASR servers that use a Python API. Prior to this fix, when companies were configured to use an ASR server's Python API, files would get stuck in the processing queue. This was a rare edge case.

  5. Fixed an issue that prevented updating an organization's retention policy via the UI. Prior to this fix, the retention policy could not be updated in certain scenarios. Any upgrade from a previous release to 4.0.1-5 requires running core-update to fully address this issue.

Known Issues in V‑Spark 4.0

  1. Audio files ingested with incorrectly formatted JSON metadata files may be imported without their expected metadata fields. These files are flagged as BAD METADATA, but they are still imported, transcribed, and analyzed.

  2. Application changes may not display in real time when made by another user from a different host. Although application editing works, users editing an application simultaneously from different hosts must refresh the Application Editor page to see changes made by another user. This issue does not typically occur when both users are being served by the same host.