V-Spark Online Help

Regular Expression Search

Regular expression (regex) queries search text using pattern matching. V‑Spark regex syntax is briefly described below. To run a regex search, select the Regex option for search text in the Dashboard File View. As with plain text queries, regex queries in V‑Spark match whole terms only; there are no partial matches.

Note

Regex queries operate on individual terms and cannot be used to match multi-word phrases. For each regex query, the search engine scans the list of terms in the inverted index to find all matching terms. It then retrieves all documents that contain each matching term.

This means that a regex query that matches many unique terms can be very resource intensive. Avoid patterns that start with a wildcard (for example, .*foo).
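
The term scan described above can be sketched in Python. This is a toy model only: the INVERTED_INDEX data and the regex_search function are hypothetical, and Python's re module stands in for the actual search engine, whose regex syntax differs in places.

```python
import re

# Hypothetical toy inverted index: term -> IDs of documents containing it.
INVERTED_INDEX = {
    "foo": {1, 2},
    "food": {2},
    "seafood": {3},
    "bar": {1, 4},
}

def regex_search(pattern, index):
    """Return (matching terms, matching document IDs) for a whole-term regex."""
    compiled = re.compile(pattern)
    matched_terms = []
    doc_ids = set()
    # This toy version tests every term, so cost grows with vocabulary size.
    for term, postings in index.items():
        # fullmatch requires the pattern to cover the entire term.
        if compiled.fullmatch(term):
            matched_terms.append(term)
            doc_ids |= postings
    return sorted(matched_terms), doc_ids

narrow = regex_search("foo.*", INVERTED_INDEX)   # terms: foo, food
broad = regex_search(".*foo.*", INVERTED_INDEX)  # terms: foo, food, seafood
```

The more terms a pattern matches, the more document lists must be retrieved and merged. A pattern with a literal prefix generally lets an engine skip terms that do not share that prefix, which is why a leading wildcard tends to be the most expensive case.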

For more information on regex syntax as used in V‑Spark, refer to the documentation for Elasticsearch 1.4.

Anchoring

Most regex search engines will match any part of a word. In these cases ^ and $ are used to anchor searches to the beginning and end of a word, respectively. However, since V‑Spark regex searches will only match whole words, these special anchors are not required and not valid except as literal characters. As an example, for the word "abcde":

ab.* # match
abcd # no match
^abcd # no match
abcd$ # no match
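
The whole-term behavior shown above can be approximated in Python for experimentation. The whole_term_match helper is hypothetical and only mimics the behavior with Python's re module: it escapes ^ and $ so that they act as literal characters, and it uses fullmatch so that the pattern must cover the entire term. It does not handle anchors that are already backslash-escaped.

```python
import re

def whole_term_match(pattern, term):
    """Approximate whole-term matching in which ^ and $ are literals."""
    # Escape ^ unless it directly follows "[" (a negated character class).
    literal = re.sub(r"(?<!\[)\^", r"\\^", pattern)
    # Escape $ so that it is matched as an ordinary character.
    literal = literal.replace("$", r"\$")
    # fullmatch succeeds only if the pattern covers the whole term.
    return re.fullmatch(literal, term) is not None

for pattern in ["ab.*", "abcd", "^abcd", "abcd$"]:
    result = "match" if whole_term_match(pattern, "abcde") else "no match"
    print(pattern, "#", result)
```

Under this approximation, ^abcd would match only a term that literally begins with the ^ character, such as "^abcd".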

Allowed Characters

Any Unicode characters may be used in the pattern, but certain characters are reserved. The reserved characters are:

. ? + * | { } [ ] ( ) # @ & < > ~ " \

Note

^ and $ are not reserved characters.
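
To match a reserved character literally, it generally needs to be escaped. The sketch below assumes that V‑Spark follows the Elasticsearch convention of escaping a reserved character with a preceding backslash; this page does not state that, so verify it against the Elasticsearch documentation. The escape_term helper and the RESERVED set are hypothetical names.

```python
# Reserved characters as listed above, including the backslash itself.
RESERVED = set('.?+*|{}[]()#@&<>~"\\')

def escape_term(text):
    """Prefix each reserved character with a backslash (assumed convention)."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)

print(escape_term("a.b"))   # the dot is escaped
print(escape_term("^x$"))   # unchanged: ^ and $ are not reserved
```

Escaping user-supplied text this way before embedding it in a larger pattern keeps characters such as . and * from being interpreted as operators.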