V-Spark Online Help

Release Notes

V‑Spark 4.0.1

V‑Spark 4.0.1 is a major release with a significantly changed architecture, along with numerous improvements and bug fixes.

  1. V‑Spark now runs on CentOS 7. The changes to V‑Spark’s technical foundation required by the CentOS 7 upgrade are the main drivers for this release.

  2. V‑Spark services are now managed by systemd. SysVinit has been phased out and replaced with systemd as part of the upgrade to CentOS 7.

    1. This update removes init.d and intermediate scripts where possible, and introduces a hierarchy of individually managed services defined in unit files. The new service hierarchy can be represented as follows:

      [Figure: V‑Spark services hierarchy diagram]
    2. V‑Spark services can be controlled as a group by using the top-level vspark.service, which acts as a wrapper around the others. To start this service, use the following command:

      $ sudo systemctl start vspark
    3. The services started by vspark.service are specified in the Wants directive in the /usr/lib/systemd/system/vspark.service file. If vspark.service is enabled, it will be started automatically on boot and reboot, and any services in the Wants directive will also try to start. Any excluded services will not start.

    4. Each service can also be started, stopped, and restarted individually with systemctl, without invoking vspark.service, as in these examples:

      $ sudo systemctl start vspark-back
      $ sudo systemctl start vspark-front 
      $ sudo systemctl start vspark-jobmgr

      Other systemctl options for the vspark service include the following:

      $ sudo systemctl restart vspark 
      $ sudo systemctl stop vspark 
      $ sudo systemctl status vspark
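
      To enable vspark.service so that it starts at boot, or to inspect which services its Wants directive pulls in, the standard systemctl commands apply; these are shown as general examples rather than V‑Spark-specific requirements:

      $ sudo systemctl enable vspark
      $ systemctl list-dependencies vspark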
  3. V‑Spark now has a dedicated command-line script for system administration. To use it, run sudo vspark-admin OPTION, replacing OPTION with any of the parameters in the following table.

    Table 1. Option parameters for vspark-admin

    Parameter        Description
    status           Show the status of all V‑Spark services.
    status OPTION    Show the status of a specific service. Replace OPTION with
                     one of these parameters: front, back, jobmgr, sccluster.
    version          Display the currently installed version of V‑Spark.
    core-check       Show whether service dependencies are within the allowed
                     version range.
    core-update      Show available updates for V‑Spark dependencies. To apply
                     the applicable changes, invoke with the commit parameter
                     -c, as in this example:

                     $ sudo vspark-admin core-update -c

    check-health     Show status information about service dependencies.
    show-config      Display every configuration setting (as name-value pairs)
                     that V‑Spark is using.

    Note

    Configuration changes made to a running V‑Spark installation do not take effect until the configuration is reloaded.



    For example, to check the status of V‑Spark's front-end services, run the following command:

    $ sudo vspark-admin status front
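
    The other parameters in the table are invoked the same way, for example:

    $ sudo vspark-admin version
    $ sudo vspark-admin core-check
    $ sudo vspark-admin show-config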
  4. V‑Spark now attempts to reconnect to service dependencies automatically. When certain required services become unavailable, V‑Spark will attempt to reconnect to them in order to minimize service disruption. By default, there is no limit to the number of times V‑Spark will attempt to reconnect.

  5. V‑Spark now uses improved logic to handle connectivity issues with the license server. When revalidation fails due to loss of network connectivity or other scenarios, V‑Spark will respect the time to live (TTL) value associated with the requested license. As long as the TTL with the license server hasn't expired, the license will stay validated in order to minimize service interruptions.

  6. V‑Spark now includes the voci-spark-tools package, which contains two utilities—Datatool and Config Manager—that facilitate the transfer of installation data and configuration settings.

    1. Datatool is a command-line tool for importing (load) and exporting (backup) audio and transcription data. For more details, see the Datatool.MD file in the utils/datatool directory.

    2. Config Manager is a script for importing and exporting installation configuration settings. This script can be found in the utils/ directory.

  7. Improved V‑Spark security to help protect against remote code execution. For more secure file handling, filename validation rules apply to audio files and ZIP archives uploaded for transcription via the GUI and API, and to files uploaded to the Audio Evaluator. Default naming rules forbid these characters: #*<>:?/\|{}$!'`"=^

    This setting is enabled by default. To disable it, set the filename_validation configuration setting to off. To define custom character requirements, specify a regular expression via filename_validation_pattern.
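
    As a rough illustration only, and not V‑Spark's own validation logic, the following bash check flags a filename containing any character from the default forbidden set; the example filename is hypothetical, and the exact regular-expression syntax that filename_validation_pattern expects may differ:

    $ forbidden='#*<>:?/\|{}$!'\''`"=^'
    $ name='agent_call#42.wav'
    $ [[ "$name" == *["$forbidden"]* ]] && echo "rejected: $name" || echo "accepted: $name"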

  8. Improved V‑Spark security to help protect against SQL injections.

  9. V‑Spark can now log which users have viewed the File Details page. To minimize unnecessary logging noise, this setting is off by default. To enable it, set the new configuration option audit_filedetails_pageviews to on. When enabled, each page view is logged as an audit entry in the Activity Log and recorded in server.log as an INFO entry.
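
    When the option is enabled, the resulting INFO entries can be reviewed in /var/log/vspark/server.log (unchanged in 4.0.1; see the log location tables below). The exact audit message text is not reproduced here, so this example simply filters recent INFO lines:

    $ sudo grep INFO /var/log/vspark/server.log | tail -n 20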

  10. Links to HTML documentation in V‑Spark's help menu now point to docs.vocitec.com/V-Spark/. This update deprecates the release_note configuration option because the HTML version of the release notes has moved from the installation to the website. Note that the release_note_dl configuration still exists because PDF versions of documentation are still bundled with the software.

Announcements with V‑Spark 4.0.1

  1. System architecture has significantly changed with the V‑Spark 4.0.1 release. System administrators should take note of the new dependencies listed in the following table as they evaluate installation requirements.

    Table 1. Changes in V‑Spark dependencies from version 3.5.x to 4.0.1

    V-Spark 3.5.x        V-Spark 4.0.1
    CentOS 6             CentOS 7
    Node 6               Node 10
    Elasticsearch 5.6    Elasticsearch 7.6
    MySQL 5.1            MariaDB 5.5 (EPEL repository)
    Redis 3.2            Redis 3.2 (no changes, EPEL repository)



    Note

    Customers upgrading from a 3.4.3 or 3.5.x V‑Spark system running on CentOS 6 should contact customer support for recommended upgrade paths.

    Preliminary testing shows that enhancements in the 4.0.1 release lead to a 10% average performance improvement during data ingestion over previous versions.
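
    In addition to the vspark-admin core-check and check-health options described earlier, installed dependency versions can be spot-checked with standard commands. These examples assume default packages and Elasticsearch's default port (9200); adjust them for your installation:

    $ node --version
    $ mysql --version
    $ redis-server --version
    $ curl -s http://localhost:9200 | grep number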

  2. The location of some V‑Spark log files has changed. Some logged information now goes to CentOS 7's journald and can be viewed using journalctl from the command line, as in the following example:

    $ sudo journalctl --unit=vspark-jobmgr
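
    Standard journalctl options such as --follow and --since can also be used to tail a unit's output live or to limit it by time, for example:

    $ sudo journalctl --unit=vspark-front --follow
    $ sudo journalctl --unit=vspark-back --since today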

    This change keeps V‑Spark log files consistent with systemd service handling best practices. V‑Spark 4.0.1 log locations are listed in the following tables:

    Table 2. /var/log/vspark/

    Original Location                    New Location
    /var/log/vspark/front-err.log        journal
    /var/log/vspark/back-err.log         journal
    /var/log/vspark/backend_stdio.log    No change
    /var/log/vspark/backendWorker.log    No change
    /var/log/vspark/license.log          No change
    /var/log/vspark/search.log           No change
    /var/log/vspark/server.log           No change



    Table 3. /var/log/vocijobmgr/

    Original Location               New Location
    /var/log/vocijobmgr/init.err    journal
    /var/log/vocijobmgr/init.out    journal



    Table 4. /var/lib/vspark/managers/{company-org-folder}/logs/

    Original Location    New Location
    *                    No change



  3. The voci-spark-hdfs-lib package is now optional, and it is no longer installed by default, as most installations don't require HDFS support.
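
    For installations that do need HDFS support, the package can presumably be installed through the system package manager from the same repository as the other V‑Spark packages; repository configuration is site-specific and is not shown here:

    $ sudo yum install voci-spark-hdfs-lib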

Fixes in V‑Spark 4.0.1

The following issues have been resolved in the V‑Spark 4.0.1 release.

  1. Folder permissions were sometimes set incorrectly when using the API to create folders. This caused the API call to fail sporadically and left the affected folders hidden and unusable until their permissions were corrected.

  2. Dashboard performance when viewing individual folders has been improved. Note that only single-folder dashboard performance was affected by this issue.

  3. V‑Spark services now restart automatically when the host system boots or reboots.

  4. Backend V‑Spark services now properly stop when Redis is down. Prior to this change, stopping the backend service while Redis was down could lead to errors that required killing backend processes manually in order to restart them.

  5. The folder server status indicator now consistently shows red or green, regardless of user role.

  6. The call volume data display on the Overview Dashboard now behaves more consistently. Previously, page refreshes would show either the monthly view or the 31-day view at random. The dashboard now shows the monthly view or the 31-day view consistently across subsequent page loads, depending on which view was last selected.

  7. Organizations created via the /config endpoint will now use the "US/Eastern" time zone if none is specified in the initial API call. Previously, no time zone would be assigned to the organization. This would eventually cause the UI to fail to render some pages, as certain pages expect time zone information to exist for organizations.

Maintenance Fixes Since the Initial 4.0.1 Release

This section describes the fixes included in the latest version of the 4.0.1 release. Maintenance versions and associated fixes are grouped by release version and listed most recent to earliest.

V‑Spark 4.0.1-4 Fixes

  1. Addressed a continuous memory growth issue affecting Job Manager processes. A new V‑Spark option (jm_memory_threshold) has been added that specifies the memory threshold in bytes for Job Manager folder workers.

    When the worker process exceeds the memory threshold, it is internally paused until the current jobs finish. It then terminates and is replaced by a new Job Manager folder worker process.
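
    The Job Manager service's current memory use can be observed with standard systemd tooling, using the unit name shown elsewhere in these notes; whether a memory figure is reported depends on the unit's accounting settings:

    $ systemctl status vspark-jobmgr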

  2. Fixed an issue that caused file scoring to fail without retry if Elasticsearch failed to create certain temporary indices.

  3. Fixed a file permission issue with the vspark-admin script's core-check option. The script now reports the correct filesystem version when executed after a clean V‑Spark install.

  4. Fixed V‑Spark's integration with ASR servers that use a Python API. Prior to this fix, when companies were configured to use an ASR server's Python API, files would get stuck in the processing queue. This was a rare edge case.

Known Issues in V‑Spark 4.0.1

  1. We are monitoring a memory leak in the Job Manager service. The amount of memory leaked may increase if the installation has a large number of busy or active folders. Restarting the Job Manager service temporarily addresses the issue.

    As of the 4.0.1-4 release, we have implemented a workaround that restarts the leaking Job Manager process. We are continuing to monitor the issue.
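
    A manual restart of the Job Manager service uses the unit name shown earlier in these notes:

    $ sudo systemctl restart vspark-jobmgr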

  2. Audio files ingested with incorrectly formatted JSON metadata files may be imported without their expected metadata fields. These files are flagged as BAD METADATA, but they are still imported, transcribed, and analyzed without the expected metadata.

  3. Application changes may not display in real time when made by another user from a different host. Although application editing works, users editing an application simultaneously from different hosts must refresh the Application Editor page to see changes made by another user. This issue does not typically occur when both users are being served by the same host.