Keyword expression syntax
MiaRec supports a powerful expression syntax for matching keywords and phrases in text.
The following operators are supported:
- AND
- OR
- &
- |
- NOT
- NOTIN
- NEAR
- NOTNEAR
- ONEAR
- MATCHES
Additionally, it supports:
- parentheses "(" and ")" for grouping
- wildcards
Boolean operators (AND, OR, NOT, & and |)
Expression | Description |
---|---|
quick OR brown | matches "quick fox" and "brown fox" |
quick AND fox | matches "quick fox" |
NOT brown AND fox | matches "quick fox" but not "brown fox" |
Synonyms & and |
Symbols & and | are synonyms for AND and OR, respectively.
Expression | Equivalent form |
---|---|
quick \| brown | quick OR brown |
quick & fox | quick AND fox |
(quick \| brown \| grey ) & fox | (quick OR brown OR grey) AND fox |
When using the | and & symbols, a space character between words is optional. The following are valid expressions:
- (quick | brown | grey ) & fox
- (quick|brown|grey)&fox
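The symbol synonyms can be illustrated with a small normalization step. This is only a sketch: the function name normalize_symbols is hypothetical, it is not MiaRec's implementation, and it assumes the expression contains no quoted phrases or escaped symbols.

```python
import re


def normalize_symbols(expression: str) -> str:
    """Rewrite the & and | symbols as the AND / OR keywords.

    Hypothetical helper for illustration only; quoted phrases and
    escaped symbols are not handled.
    """
    # Surrounding spaces are optional, so absorb them around each symbol.
    spaced = re.sub(r"\s*&\s*", " AND ", expression)
    spaced = re.sub(r"\s*\|\s*", " OR ", spaced)
    # Tidy any spaces left just inside parentheses.
    spaced = re.sub(r"\(\s+", "(", spaced)
    spaced = re.sub(r"\s+\)", ")", spaced)
    return spaced.strip()


print(normalize_symbols("(quick|brown|grey)&fox"))
```

Both spellings of the example above normalize to the same keyword form, (quick OR brown OR grey) AND fox.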
Case in operator names
Operator names are case-sensitive. AND is treated as an operator, while lowercase "and" is treated as a literal word, as in the text "what a beautiful and amazing day".
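A minimal sketch of the case rule, assuming a tokenizer that simply compares each token against the uppercase operator names. The names OPERATORS and classify are hypothetical and do not come from MiaRec.

```python
# Operator names as listed at the top of this page (uppercase only).
OPERATORS = {"AND", "OR", "NOT", "NOTIN", "NEAR", "NOTNEAR", "ONEAR", "MATCHES"}


def classify(token: str) -> str:
    """Return "operator" for an uppercase operator name, otherwise "word"."""
    # Strip an optional distance suffix such as ":3" before comparing.
    name = token.split(":")[0]
    return "operator" if name in OPERATORS else "word"


for token in ["AND", "and", "NEAR:3", "near"]:
    print(token, "->", classify(token))
```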
Order of term match
The order of matched words is not taken into account. If the order is important, use the ONEAR operator.
Expression | Description |
---|---|
quick AND fox | matches both "quick fox" and "fox quick" |
Distance between matched words
The distance between words is not taken into account. If the distance is important, use a quoted term or the NEAR, ONEAR and NOTNEAR operators.
Expression | Description |
---|---|
quick AND fox | matches both "quick fox" and "quick dog was chasing a fox" |
"quick fox" | matches "dog was chasing a quick fox" but not "quick dog was chasing a fox" |
quick NEAR:3 fox | matches "quick fox" and "quick and cute fox" but not "quick dog was chasing a fox", because the distance between the words quick and fox is more than 3 words. |
Quoted term
Use quotes (") to search for an exact phrase.
Expression | Description |
---|---|
"quick fox" | matches "dog was chasing a quick fox" but not "quick dog was chasing a fox" |
quick fox | matches both "quick fox" and "quick dog was chasing a fox", because this expression is equivalent to quick NEAR:5 fox, i.e. the words quick and fox can be up to 5 words apart. |
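The quoted-term behaviour can be sketched as a search for consecutive words. This is an illustration only: the function name contains_phrase is hypothetical, and splitting on whitespace is an assumption, not a description of how MiaRec tokenizes text.

```python
def contains_phrase(text: str, phrase: str) -> bool:
    """Return True if the words of `phrase` appear consecutively in `text`."""
    words = text.split()
    target = phrase.split()
    n = len(target)
    # Slide a window of n words over the text and compare it to the phrase.
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))


print(contains_phrase("dog was chasing a quick fox", "quick fox"))
print(contains_phrase("quick dog was chasing a fox", "quick fox"))
```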
Wildcards
A wildcard character is used to substitute one or more characters in a string.
Supported wildcard characters:
Symbol | Description | Example |
---|---|---|
* | Represents zero or more characters | bl* finds bl, black, blue, and blob |
? | Represents a single character | h?t finds hot, hat, and hit |
[ ] | Represents any single character within the brackets | h[oa]t finds hot and hat, but not hit |
! | Represents any character not in the brackets | h[!oa]t finds hit, but not hot and hat |
- | Represents a range of characters | c[a-b]t finds cat and cbt |
To use a wildcard character literally, escape it with the \ symbol. For example, the expression yes\! will find the text yes!.
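One way to reason about the wildcard rules is to translate a pattern into a regular expression. The sketch below is an assumption about the semantics described in the table, not MiaRec's implementation, and the name wildcard_to_regex is hypothetical. It matches a pattern against a single word.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard pattern into a compiled regular expression."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: take the next character literally.
            out.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "*":
            out.append(".*")  # zero or more characters
        elif ch == "?":
            out.append(".")  # exactly one character
        elif ch == "[" and pattern.find("]", i + 1) != -1:
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end]
            if body.startswith("!"):
                body = "^" + body[1:]  # [!oa] means "not o or a"
            out.append("[" + body + "]")  # ranges such as a-b pass through
            i = end + 1
            continue
        else:
            out.append(re.escape(ch))
        i += 1
    return re.compile("".join(out))


for pat, word in [("bl*", "black"), ("h?t", "hit"), ("h[!oa]t", "hot")]:
    print(pat, word, bool(wildcard_to_regex(pat).fullmatch(word)))
```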
Grouping
Multiple terms or clauses can be grouped together with parentheses, to form sub-queries:
Examples:
- (quick OR brown) AND fox
- (quick | brown | grey) fox
- NOT brown AND (fox OR dog)
Proximity operators (NEAR, ONEAR, NOTNEAR, NOTIN)
Proximity operators allow you to locate one searched term within a certain distance of another.
NEAR[:x]
Finds the phrase where the terms joined by the operator are within the specified number of words of each other, where x is the maximum distance between the searched terms.
Note 1. The distance parameter is optional. If omitted, a default distance of 5 is used, i.e. NEAR is equivalent to NEAR:5
Note 2. The order of the found terms is not taken into account, i.e. brown NEAR fox will match both "dog is chasing brown fox" and "fox is chasing brown dog".
Note 3. When chaining multiple operators, parentheses must be used if the distances differ. For example, the expressions brown NEAR quick NEAR fox and brown NEAR:2 quick NEAR:2 fox are both valid, but brown NEAR:2 quick NEAR:5 fox is not valid, because the distance is 2 in one case and 5 in the other. Parentheses must be added to make such an expression valid: (brown NEAR:2 quick) NEAR:5 fox
Expression | Description |
---|---|
cancel* NEAR order | Matches "cancel my order" and "order is cancelled", but not "cancel my account and then place an order", because the distance between cancel and order in the last example is more than the default 5 words. |
cancel* NEAR:1 order | Matches "cancel order", but not "cancel my order", because the distance between the words is more than the requested 1. |
brown NEAR quick NEAR fox | Matches "brown and quick fox", but not "brown fox" |
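The NEAR examples are consistent with measuring distance as the difference between word positions, so adjacent words are at distance 1. The sketch below makes that assumption explicit. The function name near is hypothetical, it handles plain words only (no wildcards), and it is not MiaRec's implementation.

```python
def near(text: str, left: str, right: str, distance: int = 5) -> bool:
    """Return True if `left` and `right` occur within `distance` words, in any order."""
    words = text.split()
    left_pos = [i for i, w in enumerate(words) if w == left]
    right_pos = [i for i, w in enumerate(words) if w == right]
    # Distance is the absolute difference of word positions (assumption).
    return any(
        a != b and abs(a - b) <= distance
        for a in left_pos
        for b in right_pos
    )


print(near("quick and cute fox", "quick", "fox", 3))
print(near("quick dog was chasing a fox", "quick", "fox", 3))
```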
ONEAR[:x]
Similar to the NEAR operator, but the order of the matched terms is taken into account. For example, brown ONEAR fox will match "brown fox" but not "fox brown".
Expression | Description |
---|---|
cancel* ONEAR order | Matches "cancel my order" but not "order is cancelled", because the order of terms is important. |
Note 1. The distance parameter is optional. If omitted, a default distance of 5 is used, i.e. ONEAR is equivalent to ONEAR:5
Note 2. When chaining multiple operators, parentheses must be used if the distances differ. For example, the expressions brown ONEAR quick ONEAR fox and brown ONEAR:2 quick ONEAR:2 fox are both valid, but brown ONEAR:2 quick ONEAR:5 fox is not valid, because the distance is 2 in one case and 5 in the other. Parentheses must be added to make such an expression valid: (brown ONEAR:2 quick) ONEAR:5 fox
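The ordered variant can be sketched by requiring the second term to come after the first. As before, this assumes distance is the difference between word positions and handles plain words only. The name onear is hypothetical and the code is not MiaRec's implementation.

```python
def onear(text: str, first: str, second: str, distance: int = 5) -> bool:
    """Return True if `second` follows `first` within `distance` words."""
    words = text.split()
    first_pos = [i for i, w in enumerate(words) if w == first]
    second_pos = [i for i, w in enumerate(words) if w == second]
    # b - a must be positive, so `second` has to appear after `first`.
    return any(0 < b - a <= distance for a in first_pos for b in second_pos)


print(onear("brown fox", "brown", "fox"))
print(onear("fox brown", "brown", "fox"))
```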
NOTNEAR[:x]
Syntax:
<term-1> NOTNEAR[:x] <term-2>
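Operator NOTNEAR finds the term on the left side of the operator (<term-1>) only when it is not within the specified distance of the term on the right side (<term-2>).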
Expression | Description |
---|---|
cancel* NOTNEAR account | Matches "cancel order" and "order is cancelled", but neither "cancel my account" nor "this account is cancelled". |
cancel* NOTNEAR:1 account | Matches "cancel my bank account" but not "cancel account", because in the latter the terms are within the excluded distance of 1 word. |
Note 1. The distance parameter is optional. If omitted, a default distance of 5 is used, i.e. NOTNEAR is equivalent to NOTNEAR:5
Note 2. The order of the found terms is not taken into account, i.e. cancel* NOTNEAR account will exclude both "cancel account" and "account canceled"
Note 3. Chaining of the NOTNEAR operator is not supported. Use parentheses to explicitly group multiple expressions. For example, cancel* NOTNEAR bank* NOTNEAR account must be rewritten as cancel* NOTNEAR (bank* NOTNEAR account)
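The NOTNEAR examples can be reproduced with a sketch that accepts an occurrence of the left term only if every occurrence of the right term is farther away than the given distance. The name notnear is hypothetical, the code handles plain words only, and position difference as the distance measure is an assumption.

```python
def notnear(text: str, term: str, excluded: str, distance: int = 5) -> bool:
    """Return True if `term` occurs with no `excluded` word within `distance` words."""
    words = text.split()
    excluded_pos = [i for i, w in enumerate(words) if w == excluded]
    return any(
        # Keep this occurrence only if all excluded words are too far away.
        w == term and all(abs(i - j) > distance for j in excluded_pos)
        for i, w in enumerate(words)
    )


print(notnear("cancel my bank account", "cancel", "account", 1))
print(notnear("cancel account", "cancel", "account", 1))
```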
NOTIN
Operator NOTIN matches a term only when it is not part of a longer term. For example, you may want to find the word "problem", but not when it is part of the phrase "not a problem".
Examples:
- problem NOTIN "not a problem"
- problem NOTIN "no problem"
- problem NOTIN ((no|not) ONEAR problem)
- problem NOTIN no* ONEAR:2 problem
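For the simplest case, a single word excluded from a fixed phrase, NOTIN can be sketched by marking the positions covered by the phrase and accepting only occurrences outside them. The name notin is hypothetical, and the sketch does not cover expressions such as ONEAR on the right-hand side.

```python
def notin(text: str, word: str, phrase: str) -> bool:
    """Return True if `word` occurs somewhere outside any occurrence of `phrase`."""
    words = text.split()
    target = phrase.split()
    n = len(target)
    covered = set()
    # Record every word position that belongs to an occurrence of the phrase.
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            covered.update(range(i, i + n))
    return any(w == word and i not in covered for i, w in enumerate(words))


print(notin("this is not a problem", "problem", "not a problem"))
print(notin("we have a problem", "problem", "not a problem"))
```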
Count occurrences (MATCHES)
Operator MATCHES finds a term that occurs the requested number of times in a text. For example, it can be used to find a phrase where at least 8 digits are spoken.
Syntax:
<term> MATCHES: N[-M]
Where:
- <term> is a search expression, which can be a word, a phrase or a complex expression like (brown | quick)
- N is the minimum number of occurrences of the term in the text
- M is the maximum number of occurrences of the term in the text. If omitted, M is equal to N, i.e. brown MATCHES: 2 is the same as brown MATCHES: 2-2
Expression | Description |
---|---|
(one|two|three|four|five|six|seven|eight|nine|ten) MATCHES: 5-10 | Matches "one two two four five seven" but not "one two" |
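The counting rule can be sketched as follows, assuming the term is a set of alternative words and that each matching word in the text counts once. The name matches_count is hypothetical and the code is not MiaRec's implementation.

```python
from typing import Optional, Set


def matches_count(text: str, terms: Set[str], minimum: int,
                  maximum: Optional[int] = None) -> bool:
    """Return True if words from `terms` occur between `minimum` and `maximum` times."""
    if maximum is None:
        maximum = minimum  # "MATCHES: 2" is the same as "MATCHES: 2-2"
    count = sum(1 for w in text.split() if w in terms)
    return minimum <= count <= maximum


numbers = {"one", "two", "three", "four", "five",
           "six", "seven", "eight", "nine", "ten"}
print(matches_count("one two two four five seven", numbers, 5, 10))
print(matches_count("one two", numbers, 5, 10))
```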
Precedence rules
When no parentheses are present, the operators are evaluated in the following order (highest precedence first):
- NOTNEAR
- ONEAR
- NEAR
- NOTIN
- MATCHES
- NOT
- AND
- OR
Expression | Equivalent form |
---|---|
quick OR brown AND fox | quick OR (brown AND fox) |
quick NEAR brown AND fox | (quick NEAR brown) AND fox |
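The effect of the precedence list can be sketched with a small precedence-climbing routine that adds explicit parentheses. This is an illustration under simplifying assumptions: it handles only infix operators separated by spaces (no NOT, MATCHES, distance suffixes, quotes or existing parentheses), and the names PRECEDENCE and parenthesize are hypothetical.

```python
# Binding strength derived from the order listed above (higher binds tighter).
PRECEDENCE = {"NOTNEAR": 8, "ONEAR": 7, "NEAR": 6, "NOTIN": 5, "AND": 2, "OR": 1}


def parenthesize(expression: str) -> str:
    """Return the expression with explicit parentheses around every operation."""
    tokens = expression.split()
    pos = 0

    def parse(min_prec: int) -> str:
        nonlocal pos
        left = tokens[pos]  # a plain term
        pos += 1
        # Consume operators that bind at least as tightly as min_prec.
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            # The right operand may only absorb tighter-binding operators.
            right = parse(PRECEDENCE[op] + 1)
            left = f"({left} {op} {right})"
        return left

    return parse(1)


print(parenthesize("quick OR brown AND fox"))
print(parenthesize("quick NEAR brown AND fox"))
```

The outermost pair of parentheses in the output is redundant; the inner grouping corresponds to the equivalent forms in the table above.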
Default operator
If no operator is included between terms, the default NEAR operator is used:
Expression | Equivalent form |
---|---|
brown fox | brown NEAR fox |
(quick OR brown) fox | (quick OR brown) NEAR fox |
quick OR brown fox | quick OR (brown NEAR fox) |
Note that the NEAR operator has a higher priority than OR and AND (see the Precedence rules section).
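The default-operator rule can be sketched as a pass over the tokens that inserts NEAR wherever two terms (or a closing parenthesis and a term) are adjacent. The names OPERATORS, is_operator and insert_default_near are hypothetical; the sketch does not handle quoted phrases and is not MiaRec's implementation.

```python
OPERATORS = {"AND", "OR", "NOT", "NOTIN", "NEAR", "NOTNEAR", "ONEAR", "MATCHES", "&", "|"}


def is_operator(token: str) -> bool:
    # Ignore an optional distance suffix such as ":3".
    return token.split(":")[0] in OPERATORS


def insert_default_near(expression: str) -> str:
    """Insert an explicit NEAR between adjacent terms that have no operator."""
    tokens = expression.replace("(", " ( ").replace(")", " ) ").split()
    out = []
    for tok in tokens:
        if out:
            prev = out[-1]
            prev_ends_term = prev == ")" or (prev != "(" and not is_operator(prev))
            tok_starts_term = tok == "(" or (tok != ")" and not is_operator(tok))
            if prev_ends_term and tok_starts_term:
                out.append("NEAR")
        out.append(tok)
    # Re-attach parentheses to their neighbours.
    return " ".join(out).replace("( ", "(").replace(" )", ")")


print(insert_default_near("(quick OR brown) fox"))
```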
Combining operators
Multiple operators can be combined together to form a complex expression.
Expression | Description |
---|---|
cancel* NEAR (order|account) | matches "cancel order", "order is cancelled", "I am cancelling my account", "I want to cancel this order" |
definitely NOTIN "definitely not" | matches " |