Ranking is where Vespa does computing, or inference, over documents. The computations to be done are expressed in functions called ranking expressions, bundled into rank profiles defined in schemas. These can range from simple math expressions combining some rank features to tensor expressions or large machine-learned ONNX models.
Rank profiles can define two phases that are evaluated locally on content nodes, which means that no data needs to be transferred to container nodes to make inferences over data:
schema myapp {
    rank-profile my-rank-profile {
        num-threads-per-search: 4
        first-phase {
            expression {
                attribute(quality) * freshness(timestamp)
            }
        }
        second-phase {
            expression: sum(onnx(my_onnx_model))
            rerank-count: 50
        }
    }
}
The first phase is executed for all matching documents while the second is executed for the best rerank-count documents per content node according to the first-phase function. This is useful to direct more computation towards the most promising candidate documents, see phased ranking.
It's also possible to define an additional phase that runs on the stateless container nodes after merging hits from the content nodes. See global-phase in the phased ranking documentation for more details. This can be a more efficient use of CPU (especially with many content nodes) and can be used instead of second-phase, or in addition to a moderately expensive second-phase as in the example below.
schema myapp {
    rank-profile my-rank-profile {
        first-phase {
            expression: attribute(quality) * freshness(timestamp)
        }
        second-phase {
            expression {
                my_combination_of(fieldMatch(title), bm25(abstract), attribute(quality), freshness(timestamp))
            }
        }
        global-phase {
            expression: sum(onnx(my_onnx_model))
            rerank-count: 100
        }
    }
}
Vespa supports ML models in these formats:
- ONNX
- XGBoost
- LightGBM
As these are exposed as rank features, it is possible to rank using a model ensemble. Deploy multiple model instances and write a ranking expression that combines the results:
schema myapp {
    onnx-model my_model_1 {
        ...
    }
    onnx-model my_model_2 {
        ...
    }
    rank-profile my-rank-profile {
        ...
        second-phase {
            expression: max(sum(onnx(my_model_1)), sum(onnx(my_model_2)))
        }
    }
}
Models are deployed in application packages. Read more on how to automate training, deployment and re-training in a closed loop using Vespa Cloud.
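A minimal sketch of what this looks like in an application package, assuming a hypothetical model stored as models/my_onnx_model.onnx (input/output mappings omitted for brevity):

schema myapp {
    onnx-model my_onnx_model {
        # assumes the model file is placed under models/ in the application package
        file: models/my_onnx_model.onnx
    }
    rank-profile my-rank-profile {
        second-phase {
            expression: sum(onnx(my_onnx_model))
        }
    }
}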
Ranking expressions are defined in rank profiles - either inside the schema or equivalently in their own files in the application package, named schemas/[schema-name]/[profile-name].profile.
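For example, the equivalent of an in-schema profile can be placed in a file such as schemas/myapp/my-rank-profile.profile (hypothetical names), containing just the profile definition:

rank-profile my-rank-profile {
    first-phase {
        expression: attribute(quality) * freshness(timestamp)
    }
}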
One schema can have any number of rank profiles for implementing e.g. different use cases or bucket testing variations. If no profile is specified, the default text ranking profile is used.
Rank profiles can inherit other profiles. This makes it possible to define complex profiles and variants without duplication.
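A small sketch with made-up profile names: the child profile keeps what it inherits and only adds what it declares itself:

rank-profile base {
    first-phase {
        expression: nativeRank
    }
}

rank-profile fresh inherits base {
    # keeps the first-phase from base and adds a second phase
    second-phase {
        expression: nativeRank * freshness(timestamp)
    }
}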
Queries select a rank profile using the ranking.profile argument in requests or a query profile, or equivalently in Searcher code, by
query.getRanking().setProfile("my-rank-profile");
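For illustration, the same selection can be made directly in a request; the endpoint and YQL below are placeholders, and parameters would normally be URL-encoded:

/search/?yql=select * from sources * where userQuery()&query=ranking&ranking.profile=my-rank-profile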
If no profile is specified in the query, the one called default is used. This profile is available also if not defined explicitly.

Another special rank profile called unranked is also always available. Specifying this boosts performance in queries which do not need ranking because random order is fine or explicit sorting is used.
Rank profiles are not evaluated lazily. Example:
function inline foo(tensor, defaultVal) {
    expression: if (count(tensor) == 0, defaultVal, sum(tensor))
}

function bar() {
    expression: foo(tensor, sum(tensor1 * tensor2))
}
Will the sum in the bar function be computed lazily, meaning only if tensor is empty?

No, this would require lambda arguments. Only doubles and tensors are passed between functions.
The default ranking is the first-phase function nativeRank, that is, a function returning the value of the nativeRank rank feature, and no second-phase.
A good simple alternative to nativeRank for text ranking is using the BM25 rank feature.
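A minimal sketch of such a profile, assuming title and body are indexed string fields of the schema:

rank-profile bm25-text {
    first-phase {
        # sum of per-field BM25 scores
        expression: bm25(title) + bm25(body)
    }
}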
If the expression is written manually, it might be most convenient to stick with using the fieldMatch(name) feature for each field. This feature combines the more basic fieldMatch features in a reasonable way.
A good way to combine the fieldMatch score of each field is to use a weighted average, as explained above. Another way is to combine the field match scores using the fieldMatch(name).weight/significance/importance features, which take term weight or rareness or both into account and allow a normalized score to be produced by simply summing the product of this feature and any other normalized per-field score for each field.
In addition, some attribute value(s) must usually be included to determine the a priori quality of each document. For example, assuming the title field is more important than the body field, create a ranking expression which gives more weight to that field, as in the example above.
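As an illustration only, a hand-written first phase along these lines might look like the following, with made-up weights and a hypothetical quality attribute:

rank-profile text-and-quality {
    first-phase {
        expression {
            0.7 * fieldMatch(title) +
            0.3 * fieldMatch(body) +
            0.3 * attribute(quality)
        }
    }
}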
Vespa contains some built-in convenience support for this - weights can be set in the individual fields by weight: <number> and the feature match can be used to get a weighted average of the fieldMatch scores of each field.
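Sketched with hypothetical fields, the weights are declared on the fields and the match feature is then used in the rank profile:

schema myapp {
    document myapp {
        field title type string {
            indexing: summary | index
            weight: 300
        }
        field body type string {
            indexing: summary | index
            weight: 100
        }
    }
    rank-profile weighted-text {
        first-phase {
            # match combines the fieldMatch scores of each field, using the field weights above
            expression: match
        }
    }
}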
The overall ranking expression might contain other ranking dimensions than just text match, like freshness, the quality of the document, or any other property of the document or query.
Modify the values of the match features from the query by sending weight, significance and connectedness with the query:
| Feature input | Description |
| --- | --- |
| Weight | Set query term weight. Weight is used in fieldMatch(name).weight, which can be multiplied with fieldMatch(name) to yield a weighted score for the field, and in fieldMatch(name).weightedOccurrence to get an occurrence score which is higher if higher weighted terms occur most. Configure static field weights in the schema. |
| Significance | Significance is an indication of how rare a term is in the corpus of the language, used by a number of text matching rank features. This can be set explicitly for each term in the query, or by calling item.setSignificance() in a Searcher. With indexed search, default significance values are calculated automatically during indexing. However, unless the indexed corpus is representative of the word frequencies in the user's language, relevance can be improved by passing significances derived from a representative corpus. Relative significance is accessible in ranking through the fieldMatch(name).significance feature. Weight and significance are also averaged into fieldMatch(name).importance for convenience. Streaming search does not compute term significance; queries should pass this with the query terms. Read more. |
| Connectedness | Signify the degree of connection between adjacent terms in the query - set query term connectivity to another term. Term connectedness is taken into account by fieldMatch(name).proximity, which is also an important contribution to fieldMatch(name). Connectedness is a normalized value which is 0.1 by default. It must be set by a custom Searcher, looking up connectivity information from somewhere - there is no query syntax for it. |
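As a hedged sketch, weight and significance can be set per term with YQL annotations (assuming annotation support for both; field names are made up), while connectedness still has to be set in a Searcher:

select * from sources * where title contains ({weight: 200}"ranking") and body contains ({significance: 0.9}"vespa")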