V-Blaze and V-Cloud Online Help

Default Redaction File — scrub.conf

Note

V‑Cloud users cannot use custom redaction files. Contact support@vocitec.com for more information.

To create a custom redaction file, start by replicating the default redaction file in a text editor that can save plain text. Freely available editors such as Emacs, Vim, Nano, and Notepad++ work well because they save plain text by default.

The redaction file must be valid JSON, and each rule's "regex" value must be a valid regular expression. Refer to JSON Structures for more information on JSON formatting. The file must also include all of the default rules and exclusions shown in the sample below. Add custom redaction rules and exclusions after the defaults.
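Before deploying an edited file, it can help to check both requirements mechanically. The following is a minimal sketch, not a tool shipped with the product: the function name validate_redaction_rules is hypothetical, the requirement that every rule carries a string "regex" key is an assumption inferred from the default file, and Python's re module may not match the ASR Engine's regular expression flavor in every detail.

```python
import json
import re


def validate_redaction_rules(text):
    """Return a list of problems found in a redaction file's contents.

    Hypothetical helper: an empty list means the text parsed as a JSON
    array of objects and every "regex" value compiled under Python's re.
    """
    try:
        rules = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(rules, list):
        return ["top-level value must be a JSON array of rule objects"]

    problems = []
    for index, rule in enumerate(rules):
        if not isinstance(rule, dict):
            problems.append(f"rule {index}: not a JSON object")
            continue
        pattern = rule.get("regex")
        # Assumption: every rule needs a string "regex", as in the defaults.
        if not isinstance(pattern, str):
            problems.append(f"rule {index}: missing string 'regex'")
            continue
        try:
            re.compile(pattern)
        except re.error as exc:
            problems.append(f"rule {index}: regex does not compile: {exc}")
    return problems
```

A typical use would be to read the edited file with open(path, encoding="utf-8").read(), pass the text to the function, and fix anything it reports before installing the file.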

The following sample is the default redaction file. It is applied automatically unless an /opt/voci/state/scrub.conf file exists:

[
  {
    "README": "DO NOT EDIT/REMOVE THIS FILE - USER MODIFICATIONS SHOULD BE MADE IN /opt/voci/state/scrub.conf",
    "description": "README",
    "regex": "^$",
    "repl": "",
    "text": false,
    "report": false,
    "audio": false
  },
  {
    "description": "always scrub audio for any characters surrounded by double-octothorpes (via substitutions)",
    "regex": "^((¿)?)##(.+)##([.,?]?)$",
    "repl": "\\1\\3\\4",
    "tospace": "_"
  },
  {
    "description": "exclude any characters surrounded by double-atsymbols (via substitutions)",
    "regex": "^((¿)?)@@(.+)@@([.,?]?)$",
    "repl": "\\1\\3\\4",
    "tospace": "_",
    "report": false,
    "audio": false
  },
  {
    "description": "Exclude words that include non-digits other than punctuation",
    "regex": "[^-+$%:0-9.,?]",
    "repl": "",
    "text": false,
    "report": false,
    "audio": false
  },
  {
    "description": "exclude ordinal numbers from scrubbing",
    "regex": "^(¿)?(\\d+/)?\\d+(st|nd|rd|th|ᵒ|ᵃ|e|er|re)[.,?]?$",
    "repl": "",
    "text": false,
    "report": false,
    "audio": false
  },
  {
    "description": "exclude percentages from scrubbing",
    "regex": "^(¿)?(\\d+[.,])?\\d+%[.,?]?$",
    "repl": "",
    "text": false,
    "report": false,
    "audio": false
  },
  {
    "description": "exclude clock times from scrubbing",
    "regex": "^(¿)?([1-9]|10|11|12):[0-5][0-9]( [AP]M)?[.,?]?$",
    "repl": "",
    "text": false,
    "report": false,
    "audio": false
  },
  {
    "description": "exclude prices from scrubbing",
    "regex": "^(¿)?([\\d,. ]+(R?\\$|€)|(R?\\$|€)[\\d,. ]+)[.,?]?$",
    "repl": "",
    "text": false,
    "report": false,
    "audio": false
  },
  {
    "description": "exclude short floating point numbers (w/decimal point) from scrubbing",
    "regex": "^(¿)?\\d{1,4}[.,]\\d{1,4}[.,?]?$",
    "repl": "",
    "text": false,
    "report": false,
    "audio": false
  },
  {
    "description": "replace all other digits with #",
    "regex": "\\d",
    "repl": "#"
  }
]
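To see which kinds of tokens the default exclusion patterns are meant to protect, the patterns can be tried against sample words. This is only an illustrative sketch: the names EXCLUSIONS, matching_exclusion, and mask_digits are hypothetical, the patterns are the JSON-unescaped forms of five of the default rules, and the code uses Python's re module as a stand-in for the engine's matcher. It does not reproduce how the ASR Engine orders or combines rules.

```python
import re

# Exclusion patterns copied from the default file, after JSON unescaping
# (each "\\d" in the JSON text is the regular expression "\d").
EXCLUSIONS = {
    "ordinal": r"^(¿)?(\d+/)?\d+(st|nd|rd|th|ᵒ|ᵃ|e|er|re)[.,?]?$",
    "percentage": r"^(¿)?(\d+[.,])?\d+%[.,?]?$",
    "clock time": r"^(¿)?([1-9]|10|11|12):[0-5][0-9]( [AP]M)?[.,?]?$",
    "price": r"^(¿)?([\d,. ]+(R?\$|€)|(R?\$|€)[\d,. ]+)[.,?]?$",
    "short float": r"^(¿)?\d{1,4}[.,]\d{1,4}[.,?]?$",
}


def matching_exclusion(token):
    """Return the name of the first exclusion pattern matching token, or None."""
    for name, pattern in EXCLUSIONS.items():
        if re.search(pattern, token):
            return name
    return None


def mask_digits(token):
    """Apply the final default rule in isolation: replace every digit with '#'."""
    return re.sub(r"\d", "#", token)


if __name__ == "__main__":
    for word in ["3rd", "50%", "10:30", "$19.99", "3.14", "4111111111111111"]:
        print(word, "->", matching_exclusion(word) or mask_digits(word))
```

Under these assumptions, "50%" matches the percentage pattern and "$19.99" matches the price pattern, while a long digit string such as "4111111111111111" matches none of the five patterns and is masked to sixteen "#" characters by the final rule.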

Important

If an /opt/voci/state/scrub.conf file exists, the default redaction configuration distributed with the ASR Engine is ignored completely. A custom file must therefore contain the default rules and exclusions as well as any custom ones.