Ranking in Vespa is the computation that is done on matching documents during query execution. These are specified as ranking functions in rank profiles in the schema.
The special function named first-phase will determine the initial rank of the matches,
such that the top k can be selected as response to a query:
rank-profile my-rank-profile {
first-phase {
expression: 0.7 * bm25(text) + 0.3 * attribute(popularity)
}
}
The ranking functions can be any mathematical function combining rank features, including tensor math and machine-learned models.
The rank features these functions can use are of three categories:
attribute(fieldName): Any document field which has attribute in the indexing statement.query(name): Any value sent with the query as an input. When these are tensors
(not scalars) they must be declared as an input in the rank profile.
Refer to the full list of rank features.
Query features (inputs) that are tensors must be declared in the rank profile:
rank-profile my-rank-profile {
inputs {
query(user_context) tensor<float>(x[3])
}
first-phase {
expression: bm25(text) + sum(query(user_context) * attribute(document_context))
}
}
This is also how the type of query vectors in vector search are declared.
A schema can have any number of rank profiles specifying computations and ranking
for different use cases, experiments, and so on. Queries select one using the
ranking.profile parameter
in requests or a query profile.
If no profile is specified in the request, the one called default is used, and
if that isn't specified in the schema, a default one ranking by the nativeRank
feature is used. Another built-in rank profile unranked is also always available.
Specifying this boosts serving performance in queries which do not need ranking because ordering is not important or
explicit field sorting is used.
To avoid very long schema files, rank profiles can also be specified in their own files in the
application package, named
schemas/[schema-name]/[profile-name].profile.
See the schema reference for documentation
of all the content of rank profiles.
Rank profiles can inherit other profiles to avoid duplication, as in
rank-profile myProfile inherits default, another.
In addition to first-phase which specify the initial ranking that will be applied on all matching documents during matching, rank profiles can also specify functions that will be applied to rerank the top k documents before returning the final result. This is useful to direct more computation towards the most promising candidate documents:
schema myapp {
rank-profile my-rank-profile {
first-phase {
expression {
attribute(quality) * freshness(timestamp) + bm25(title)
}
}
second-phase {
expression: xgboost(my_xgboost_reranker)
rerank-count: 1000 # per content node
}
global-phase {
expression: sum(onnx(my_large_onnx_model))
rerank-count: 20 # globally
}
}
}
The second-phase expression is executed locally on the content node, using local data.
This is efficient on thousands of candidates. The global-phase expression
is executed on the global result set after merging, in the container node and is best used for
any very expensive and high quality final reranking.
See phased ranking for details.
A rank profile can define any number of functions which can be used in other ranking expressions or (when taking no arguments) be returned with results.
schema myapp {
rank-profile my-rank-profile {
function clickProbability() {
expression: xgboost('myClickModel')
}
function textRanking(field) {
expression: 0.7 * bm25(field) + 0.3 * nativeProximity(field)
}
first-phase {
expression {
0.1 * clickProbability()
0.2 * closeness(embeddingsField) +
0.3 * textRanking(titleField) +
0.4 * textRanking(bodyField)
}
}
summary-features {
clickProbability() # Returned with every mathed document
}
}
}
Read more in ranking expressions and functions.
In addition to ranking documents, a rank profile can also rank and select array elements within documents. This is most commonly used to select individual chunks within documents in RAG applications, see working with chunks.
The best quality is achieved by learning relevance functions using machine learning from a training set. Vespa lets you use machine-learned models in these formats in distributed ranking (first-and second phase):
As these are exposed as rank features, they can be used in ranking expressions exactly like any other rank feature.