Ranking

Ranking is where Vespa does computation, or inference, over the documents retrieved by a query. The goal is to order (rank) the retrieved documents.

The computations are expressed in functions called ranking expressions, bundled into rank profiles defined in schemas. These can range from simple mathematical expressions combining a few rank features to tensor expressions or large machine-learned ONNX models.

Two-phase ranking

Rank profiles can define two phases that are evaluated locally on content nodes, which means that no data needs to be transferred to container nodes to make inferences over data:

schema myapp {
    rank-profile my-rank-profile {
        num-threads-per-search: 4

        first-phase {
            expression {
                attribute(quality) * freshness(timestamp)
            }
        }

        second-phase {
            expression: sum(onnx(my_onnx_model))
            rerank-count: 50
        }
    }
}

The first phase is executed for all matching documents while the second is executed for the top-scoring rerank-count documents per content node as scored by the first-phase function. This is useful to direct more computation towards the most promising candidate documents, see phased ranking.
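
The effect of rerank-count can be sketched in plain Python. This is illustrative only, not Vespa code; the two scoring functions are hypothetical stand-ins for the expressions in the profile above:

```python
# Illustrative sketch of two-phase ranking on one content node (not Vespa code).
# The two functions are hypothetical stand-ins for the rank expressions above.

def first_phase(doc):
    # Stand-in for: attribute(quality) * freshness(timestamp)
    return doc["quality"] * doc["freshness"]

def second_phase(doc):
    # Stand-in for: sum(onnx(my_onnx_model)) - imagine an expensive model call
    return 0.7 * doc["quality"] + 0.3 * doc["freshness"]

def rank(matching_docs, rerank_count=50):
    # Phase 1: score every matching document with the cheap function
    by_first = sorted(matching_docs, key=first_phase, reverse=True)
    # Phase 2: rescore only the top rerank_count candidates
    reranked = sorted(by_first[:rerank_count], key=second_phase, reverse=True)
    return reranked + by_first[rerank_count:]

docs = [{"quality": q / 10, "freshness": f / 10}
        for q in range(10) for f in range(10)]
result = rank(docs, rerank_count=5)
```

All matches are still returned; only the top candidates pay the cost of the expensive function.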

Global-phase ranking

It is also possible to define an additional phase that runs on the stateless container nodes after merging hits from the content nodes. See global-phase in the phased ranking documentation for more details. This can make more efficient use of CPU (especially with many content nodes) and can be used instead of second-phase, or in addition to a moderately expensive second-phase as in the example below. This phase also supports GPU acceleration.

schema myapp {
    rank-profile my-rank-profile {
        first-phase {
            expression: attribute(quality) * freshness(timestamp)
        }
        second-phase {
            expression {
                my_combination_of(fieldMatch(title), bm25(abstract), attribute(quality), freshness(timestamp))
            }
        }
        global-phase {
            expression: sum(onnx(my_onnx_model))
            rerank-count: 100
        }
    }
}
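
The data flow can be sketched as follows (an assumed flow for illustration, not Vespa internals; hit contents and scores are made up):

```python
# Sketch of global-phase ranking (illustrative, not Vespa internals):
# each content node returns its locally ranked hits; the container merges
# them and rescores the top rerank-count hits of the merged list.

def global_phase(hit):
    # Stand-in for: sum(onnx(my_onnx_model)), evaluated on the container
    return hit["model_score"]

def merge_and_rerank(per_node_hits, rerank_count=100):
    # Merge all nodes' hits by their second-phase score
    merged = sorted((h for hits in per_node_hits for h in hits),
                    key=lambda h: h["score"], reverse=True)
    # Global phase reorders only the top of the merged list
    top = sorted(merged[:rerank_count], key=global_phase, reverse=True)
    return top + merged[rerank_count:]

node_a = [{"id": 1, "score": 0.9, "model_score": 0.2},
          {"id": 2, "score": 0.5, "model_score": 0.9}]
node_b = [{"id": 3, "score": 0.8, "model_score": 0.7}]
result = merge_and_rerank([node_a, node_b], rerank_count=2)
```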

Machine-learned model inference

Vespa supports ML models in several formats, including ONNX, XGBoost and LightGBM.

As these are exposed as rank features, it is possible to rank using a model ensemble. Deploy multiple model instances and write a ranking expression that combines the results:

schema myapp {

    onnx-model my_model_1 {
        ...
    }
    onnx-model my_model_2 {
        ...
    }

    rank-profile my-rank-profile {
        ...
        second-phase {
            expression: max(sum(onnx(my_model_1)), sum(onnx(my_model_2)))
        }
    }
}
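
In plain terms, each onnx(...) feature yields a tensor, sum reduces it to a scalar, and max picks the higher of the two model scores. A toy sketch with made-up model outputs:

```python
# Toy sketch of the ensemble expression above (model outputs are made up):
# sum(onnx(my_model_N)) reduces each model's output tensor to a scalar,
# and max(...) takes the better of the two.

model_1_output = [1.0, 4.0, 2.0]  # pretend output of onnx(my_model_1)
model_2_output = [3.0, 3.0]       # pretend output of onnx(my_model_2)

score = max(sum(model_1_output), sum(model_2_output))
```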

Model training and deployment

Models are deployed in application packages. Read more on how to automate training, deployment and re-training in a closed loop using Vespa Cloud.

Rank profiles

Ranking expressions are defined in rank profiles - either inside the schema or equivalently in their own files in the application package, named schemas/[schema-name]/[profile-name].profile.

One schema can have any number of rank profiles, e.g. to implement different use cases or bucket-testing variations. If no profile is specified, the default text ranking profile is used.

Rank profiles can inherit other profiles. This makes it possible to define complex profiles and variants without duplication.
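
For example, a variant profile can inherit a base profile and override only the reranking step (a sketch; the profile names are illustrative):

```
schema myapp {
    rank-profile base {
        first-phase {
            expression: attribute(quality) * freshness(timestamp)
        }
    }
    rank-profile base-with-model inherits base {
        second-phase {
            expression: sum(onnx(my_onnx_model))
        }
    }
}
```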

Queries select a rank profile using the ranking.profile argument in requests or query profiles, or equivalently in Searcher code:

query.getRanking().setProfile("my-rank-profile");

If no profile is specified in the query, the one called default is used. This profile is available also if not defined explicitly.

Another special rank profile called unranked is also always available. Specifying this boosts performance in queries which do not need ranking because random order is fine or explicit sorting is used.

Text ranking

The default ranking is the first-phase function nativeRank, i.e. a function returning the value of the nativeRank rank feature, with no second-phase. This default text scoring feature only considers how well the query matches the searched field/fieldset.

The overall ranking expression might contain other ranking dimensions than just text match, like freshness, the quality of the document, or any other property of the document or query.

A simple alternative to nativeRank for text scoring is using the BM25 feature.

Another text matching feature is fieldMatch(field), a string segment match. This feature combines the more basic fieldMatch sub-features in a reasonable way, but has a high computational cost compared to nativeRank and BM25 and is therefore only suitable for second-phase evaluation.
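
As a rough illustration of what BM25 computes, here is the textbook Okapi BM25 contribution of a single term. Vespa's bm25 feature is based on this idea; exact parameters and normalization may differ:

```python
import math

# Textbook Okapi BM25 score of one term in one field (illustrative;
# Vespa's bm25 rank feature follows the same idea, details may differ).
def bm25_term(tf, doc_len, avg_doc_len, n_docs, doc_freq, k1=1.2, b=0.75):
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    length_norm = tf + k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * tf * (k1 + 1) / length_norm

# Rarer terms contribute more than common ones, all else being equal
rare = bm25_term(tf=2, doc_len=100, avg_doc_len=120, n_docs=10_000, doc_freq=5)
common = bm25_term(tf=2, doc_len=100, avg_doc_len=120, n_docs=10_000, doc_freq=5_000)
```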

Weight, significance and connectedness

Modify the values of the match features from the query by sending weight, significance and connectedness with the query:

Weight

Set query term weight. Example: ... where (title contains ({weight:200}"heads") AND title contains "tails") specifies that heads is twice as important for the final rank score as tails (the default weight is 100).

The term weight is used in several text scoring features, including fieldMatch(name).weight and nativeRank. Note that the term weight is not used by all text scoring features; for example, bm25 does not use the term weight.

Configure static field weights in the schema.
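
The effect of a term weight is easiest to see as a weighted sum, where each term's contribution is scaled by its weight relative to the default of 100. This is illustrative arithmetic only, not Vespa's actual scoring formula:

```python
# Illustrative only: term weights scale contributions relative to the
# default weight of 100 (not Vespa's actual scoring formula).
def weighted_score(term_scores, weights, default=100):
    return sum(score * weights.get(term, default) / default
               for term, score in term_scores.items())

term_scores = {"heads": 0.5, "tails": 0.5}
baseline = weighted_score(term_scores, {})              # both terms weight 100
boosted = weighted_score(term_scores, {"heads": 200})   # heads twice as important
```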

Significance

Significance is an indication of how rare a term is in the corpus of the language, used by a number of text matching rank features. This can be set explicitly for each term in the query, or by calling item.setSignificance() in a Searcher.

With indexed search, default significance values are calculated automatically during indexing. However, unless the indexed corpus is representative of the word frequencies in the user's language, relevance can be improved by passing significances derived from a representative corpus. Relative significance is accessible in ranking through the fieldMatch(name).significance feature. Weight and significance are also averaged into fieldMatch(name).importance for convenience.

Streaming search does not compute term significance; queries should pass this with the query terms. Read more.
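
Conceptually, significance behaves like an inverse-document-frequency measure: rare terms get values near 1, ubiquitous terms near 0. A hypothetical sketch of such a measure (this is NOT Vespa's actual formula, which is internal):

```python
import math

# Hypothetical IDF-style rarity measure (NOT Vespa's actual significance
# computation): rare terms approach 1, ubiquitous terms approach 0.
def rarity(doc_freq, n_docs):
    return math.log(n_docs / doc_freq) / math.log(n_docs)

rare_term = rarity(doc_freq=10, n_docs=1_000_000)
common_term = rarity(doc_freq=900_000, n_docs=1_000_000)
```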

Connectedness

Signify the degree of connection between adjacent terms in the query by setting a query term's connectivity to another term.

For example, the query new york newspaper should have a higher connectedness between the terms "new" and "york" than between "york" and "newspaper" to rank documents higher if they contain "new york" as a phrase.

Term connectedness is taken into account by fieldMatch(name).proximity, which is also an important contribution to fieldMatch(name). Connectedness is a normalized value which is 0.1 by default. It must be set by a custom Searcher, looking up connectivity information from somewhere - there is no query syntax for it.