Semantic Rule Language

This is the reference for the semantic rule language in Vespa. For a guide on using this language, see query rewriting. Refer to the Search API for how to select rule bases in queries.

Rule bases

Semantic rules are collected in files called rule bases. These files are named [rule-base-name].sr and must be placed in [application-package]/rules/ to be deployed.

Basic syntax

A rule base may contain any number of the following four constructs, explained in the rest of this document:

- Directives
- Production rules
- Named conditions
- Comments, which start with # and end at the newline

Production rules and named conditions are statements. Statements may span multiple lines and are terminated by ;.
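
As a rough illustration of how these constructs fit together, a small rule base could look like the sketch below. Lines starting with # are comments. The brand names and the brand index are made up for the example and are not part of this reference.

```
# Make this the default rule base
@default

# Named condition: a few brand names
[brand] :- sony, dell, ibm;

# Production rule: a statement may span several lines and ends with ;
[brand]
    -> brand:[brand];
```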

Directives

A directive is a "meta-level" statement which is not used during rule evaluation, but tells the rule engine how to use the rule base. A directive starts with @ and ends at the newline. Directives may take parameters. These directives exist:

| Directive | Usage | Location |
|---|---|---|
| @default | Make this rule base the default, to be used with all queries | Anywhere outside other statements |
| @automata(<automata-filename>) | Use an automata file with this base | Anywhere outside other statements |
| @include(<rulebase-name>) | Include all the statements of another rule base in this | Anywhere outside other statements |
| @super | Include the conditions of the same-named condition from the included rule base | In a condition |
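
A minimal sketch of how these directives might be combined is shown below. The rule base name common.sr and the condition [brand] are hypothetical, and writing the included name with its .sr suffix is an assumption; check it against your Vespa version.

```
@default
@include(common.sr)

# Extend the [brand] condition inherited from the included rule base:
# @super stands for the alternatives defined there
[brand] :- @super, lenovo;
```

Here the included rule base is assumed to define its own [brand] condition; the redefinition keeps those alternatives and adds one more.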

Production Rules

A production rule is of the form

<condition> <operator> <production-list>;

This performs the production as defined by the operator if the condition matches. There are two kinds of production rules (and two operators), replacing and adding:

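| Rule kind | Operator | Meaning |
|---|---|---|
| Replacing | -> | Replace the matched terms by the production |
| Adding | +> | Add the production in addition to the matched terms |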
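
The two rule kinds can be illustrated with a short sketch. It uses -> for replacing and +> for adding; the terms are made up for the example.

```
# Replacing: the query term "lotr" is replaced by four terms
lotr -> lord of the rings;

# Adding: "tv" is kept, and "television" is added to the query
tv +> television;
```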

Namespaces

A namespace is a collection of facts which can be read from conditions and changed by productions. Namespaces may be positional (sequences), or not. A positional namespace will track the current fact and match and insert at the current position, while non-positional namespaces will match any fact against any condition.

There is a default namespace which does not need an explicit reference. For query rules, the default namespace is the query terms.

To specify the namespace that a condition reads from or that a production changes, prefix the condition or production with the namespace name:

<namespace>.<condition>
<namespace>.<production>

There are two namespaces defined during query processing:

| Namespace | Syntax | Positional | Description |
|---|---|---|---|
| Query | (none) | Yes | The default namespace. References the terms of the query. The condition value returned will be the term itself. |
| Parameter | parameter. | No | References the parameters of the query. Conditions will be true if the parameter is set in the query. The value returned from conditions is the value of the parameter. Productions need both a key and a value specified to set a parameter value. |
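
The sketch below shows one rule in each namespace, following the forms given in this document. The parameter name ranking, the value brandrank, and the brand names are assumptions chosen for illustration only.

```
[brand] :- sony, dell, ibm;

# Default (query) namespace: no prefix is needed
tv +> television;

# Parameter namespace: the production names both a key and a value,
# so a query mentioning a brand also gets the parameter set
[brand] +> parameter.ranking='brandrank';
```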

Named Conditions

A named condition is of the form

[condition-name] :- <condition>;

This simply assigns a name to the condition on the right, so it can be referred to from the conditions in rules and from other named conditions.
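
For example, a few named conditions and a rule using them could be sketched as follows. All names, terms and indexes here are hypothetical.

```
[brand] :- sony, dell, ibm;
[category] :- laptop, camera;

# A named condition may refer to another named condition
[maker] :- [brand], lenovo;

# A rule referring to named conditions, e.g. for the query "dell laptop"
[brand] [category] -> brand:[brand] category:[category];
```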

Conditions

A condition is an expression which evaluates to true or false over the facts of a namespace. If the namespace is positional (a sequence), evaluation starts at the current position in the namespace. When a condition evaluates to true, it also returns a value which can be referenced by comparison conditions.

Conditions may be preceded by a reference name and a label:

(<reference-name>/)?(<label>:)?<condition>


Reference Name

The reference name sets an explicit name by which the terms matched by the condition can be referred to elsewhere in the rule. This is useful when multiple conditions of the same type are used in the condition of the same rule.

If no reference name is given, the text between the square brackets of the condition is used as the reference name.

Label

If a label is specified, the condition will only match terms having that label (for query terms, the label is the index). If no label is specified, the condition matches terms which have no label or which have the default label.
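
The following sketch applies the form (<reference-name>/)?(<label>:)?<condition> to two hypothetical rules. The condition names, reference names and indexes are invented for the example, and the exact behavior should be verified against a running Vespa instance.

```
[city] :- oslo, bergen, trondheim;
[brand] :- sony, dell, ibm;

# Two conditions of the same type: explicit reference names tell them apart,
# so "oslo bergen" could become origin:oslo destination:bergen
from/[city] to/[city] -> origin:[from] destination:[to];

# Label on a condition: only brand terms searched in the title index match
title:[brand] -> brand:[brand];
```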

Condition

These are the supported kinds of conditions:

| Condition | Syntax | Meaning | Returned value |
|---|---|---|---|
| Term | <term> | True if this is the term at the current position | Determined by the namespace |
| Reference | [<condition-name>] | Evaluate a named condition | The matched term(s) |
| Sequence | <condition> <condition> | Match both conditions by consecutive terms, in the given order | The last nested condition value |
| Choice | <condition>, <condition> | Match any one of the conditions, each one tried at the current position | The last nested condition value |
| Group | (<condition>) | Evaluate the condition inside the grouping as a unit | The last nested condition value |
| Ellipsis | ... | Matches any sequence to make the overall condition match | The matched sequence |
| Referable ellipsis | [...] | An ellipsis where the matched sequence can be referenced from the production | The matched sequence |
| Not | !<condition> | Matches if the condition does not match | Nothing |
| And | <condition> & <condition> | Matches if all the conditions match at the same current position | The last nested condition value |
| Comparison | <condition> <operator> <condition> | True if the comparison is true for the values returned from the conditions | The last nested condition value |
| Literal | '<literal>' | Returns a value for comparison. This always evaluates to true. | The literal value |
| Start anchor | . <condition> | Matches the condition only if it matches the query from the start | The matched sequence |
| End anchor | <condition> . | Matches the condition only if it matches the query to the end | The matched sequence |
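
A few of these condition kinds are combined in the sketch below. The terms, condition names and indexes are hypothetical, and the rules are meant as illustrations of the syntax rather than tested rules.

```
[artist] :- metallica, bach;
[song] :- one, air;

# Sequence: three consecutive terms, e.g. "one by metallica"
[song] by [artist] -> song:[song] artist:[artist];

# Choice inside a group, followed by a term: "cheap hotels" or "budget hotels"
(cheap, budget) hotels -> hotels pricelevel:low;

# Referable ellipsis: whatever precedes "lyrics" is reused in the production
[...] lyrics -> song:[...];

# Start and end anchors: only matches when the whole query is "help"
. help . -> faq;
```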

Comparison Condition Operators

The possible operators of a comparison condition are

| Operator | Meaning |
|---|---|
| = | Left and right values are equal |
| <= | Left value is smaller or equal |
| >= | Left value is larger or equal |
| < | Left value is smaller |
| > | Left value is larger |
| =~ | Left value contains right value as a substring |

Production List

A production list is a space-separated list of productions which are carried out when the condition of the rule matches. A production can be preceded by the type of term to produce and a label (the index, in queries), and followed by the weight (importance) of the produced value:

(<term-type>)?(<label>:)?<production>(!<weight>)?


Term Type

The default term type is the term type of the context to which the term is added. The possible explicit term types are:

| Syntax | Meaning |
|---|---|
| ? | Insert as an OR term |
| + | Insert as an AND term |
| $ | Insert as a value (RANK) term |
| - | Insert as a NOT term |
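
As a short sketch, explicit term types could be used like this. The terms are made up, and +> is used so that the matched term stays in the query.

```
# Add "television" as an OR alternative to the matched term
tv +> ?television;

# Add a NOT term, excluding documents that mention coffee
java +> -coffee;
```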

Label

If included, the label decides the label the produced term(s) will have in the namespace. This is the index in the query namespace.

Production

There are three types of productions:

| Production | Syntax | Meaning |
|---|---|---|
| Literal term | <term> | Produce this term literally |
| Literal term with value | <term>='<value>' | Produce this term and value literally |
| Reference | [<condition-reference>] | Produce the terms matched by the referenced condition. The reference name is either the name of a named condition used in the condition, an ellipsis (...), or an explicit condition reference name. |

Weight

The weight is a percentage integer denoting the importance of the produced term. The default is 100. In the query namespace the weight becomes the term weight, determining the relevance contribution of the term.
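
Putting the parts of (<term-type>)?(<label>:)?<production>(!<weight>)? together, a production list might be sketched as below. The terms, the brand index and the chosen weight are illustrative assumptions.

```
[brand] :- sony, dell, ibm;

# Reference production with a label: the matched brand is searched in the brand index
[brand] -> brand:[brand];

# Literal production with term type and weight:
# add "color" as an OR term at half the default weight
colour +> ?color!50;
```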