Ranking

Ranking is where Vespa performs computation, or inference, over documents. The computations are expressed as functions called ranking expressions, bundled into rank profiles defined in schemas. These range from simple math expressions combining a few rank features to tensor expressions and large machine-learned ONNX models.

These ranking expressions are evaluated locally on the content nodes, so no data needs to be transferred to separate compute nodes to run inference over it. It is also possible to define additional models that are evaluated on stateless container nodes.

Two-phase ranking

Rank profiles can define two phases:

schema myapp {

    rank-profile my-rank-profile {

        num-threads-per-search: 4

        first-phase {
            expression {
                attribute(quality) * freshness(timestamp)
            }
        }

        second-phase {
            expression: sum(onnx(my_onnx_model))
            rerank-count: 50
        }

    }
}

The first phase is executed for all matching documents, while the second phase is executed only for the best rerank-count documents per content node, as ordered by the first-phase function. This directs more computation towards the most promising candidate documents; see phased ranking.
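
To make the division of work concrete, the sketch below simulates what a single content node does under a two-phase profile. It is plain Java rather than Vespa code: the `Doc` and `Scored` classes, the `rank` method and the two scoring functions passed to it are hypothetical stand-ins for the first-phase and second-phase expressions. It also simplifies the result by placing documents that were not reranked after the reranked ones in their first-phase order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Illustrative simulation of two-phase ranking on one content node.
// Every name here is hypothetical; this is not Vespa's implementation.
class TwoPhaseRankingSketch {

    // A matched document with two made-up signals.
    static final class Doc {
        final String id;
        final double cheapSignal;      // stands in for a first-phase expression
        final double expensiveSignal;  // stands in for a costly model output

        Doc(String id, double cheapSignal, double expensiveSignal) {
            this.id = id;
            this.cheapSignal = cheapSignal;
            this.expensiveSignal = expensiveSignal;
        }
    }

    // Pairs a document with the score it currently holds.
    static final class Scored {
        final Doc doc;
        final double score;

        Scored(Doc doc, double score) {
            this.doc = doc;
            this.score = score;
        }
    }

    // Returns document ids in final rank order.
    static List<String> rank(List<Doc> matches,
                             ToDoubleFunction<Doc> firstPhase,
                             ToDoubleFunction<Doc> secondPhase,
                             int rerankCount) {
        Comparator<Scored> byScoreDesc = (a, b) -> Double.compare(b.score, a.score);

        // First phase: score every matching document with the cheap function.
        List<Scored> scored = new ArrayList<>();
        for (Doc d : matches) {
            scored.add(new Scored(d, firstPhase.applyAsDouble(d)));
        }
        scored.sort(byScoreDesc);

        // Second phase: re-score only the best rerankCount candidates.
        int n = Math.min(Math.max(rerankCount, 0), scored.size());
        List<Scored> head = new ArrayList<>();
        for (Scored s : scored.subList(0, n)) {
            head.add(new Scored(s.doc, secondPhase.applyAsDouble(s.doc)));
        }
        head.sort(byScoreDesc);

        // Reranked documents first, then the rest in first-phase order.
        List<String> order = new ArrayList<>();
        for (Scored s : head) {
            order.add(s.doc.id);
        }
        for (Scored s : scored.subList(n, scored.size())) {
            order.add(s.doc.id);
        }
        return order;
    }

    public static void main(String[] args) {
        List<Doc> matches = Arrays.asList(
                new Doc("a", 0.9, 0.1),
                new Doc("b", 0.8, 0.7),
                new Doc("c", 0.1, 1.0));
        System.out.println(rank(matches, d -> d.cheapSignal, d -> d.expensiveSignal, 2));
    }
}
```

With a rerank count of 2, document `c` is never passed to the expensive function even though it would have scored highest there, so the final order is `b`, `a`, `c`. This is the trade-off of phased ranking: the first-phase function has to be good enough to keep the right candidates within the rerank window.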

Machine-Learned model inference

Vespa supports ML models in several formats, including ONNX, XGBoost and LightGBM.

As these are exposed as rank features, it is possible to rank using a model ensemble. Deploy multiple model instances and write a ranking expression that combines the results:

schema myapp {

    onnx-model my_model_1 {
        ...
    }
    onnx-model my_model_2 {
        ...
    }

    rank-profile my-rank-profile {
    ...
        second-phase {
            expression: max( sum(onnx(my_model_1)), sum(onnx(my_model_2)) )
        }
    }
}

Model Training and Deployment

To use data in Vespa to train a model, refer to the Learning to Rank guide.

Models are deployed in application packages. Read more on how to automate training, deployment and re-training in a closed loop using Vespa Cloud.

Rank profiles

Ranking expressions are defined in rank profiles - either inside the schema or equivalently in their own files in the application package, named schemas/[schema-name]/[profile-name].profile.

One schema can have any number of rank profiles for implementing e.g. different use cases or bucket testing variations. If no profile is specified, the default text ranking profile is used.

Rank profiles can inherit other profiles. This makes it possible to define complex profiles and variants without duplication.

Queries select a rank profile using the ranking.profile argument in requests or in a query profile, or equivalently in Searcher code:

query.getRanking().setProfile("my-rank-profile");

If no profile is specified in the query, the one called default is used. This profile is available also if not defined explicitly.

Another special rank profile, unranked, is also always available. Specifying it improves performance for queries that do not need ranking, either because random order is acceptable or because explicit sorting is used.
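
As a rough illustration of selecting a profile from a client, the sketch below builds a query URL that carries the ranking.profile argument. The class and method names are hypothetical, and the endpoint `http://localhost:8080`, the `/search/` path and the YQL string are assumptions about a local test deployment of the `myapp` schema, not values taken from this page. Passing `null` omits the argument, in which case the profile called default applies.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a query URL that selects a rank profile.
class RankProfileQuerySketch {

    // endpoint, yql and rankProfile are illustrative inputs; adjust to your deployment.
    static String searchUrl(String endpoint, String yql, String rankProfile) {
        StringBuilder url = new StringBuilder(endpoint)
                .append("/search/?yql=")
                .append(URLEncoder.encode(yql, StandardCharsets.UTF_8));
        if (rankProfile != null) {
            // Without this argument, the profile named "default" is used.
            url.append("&ranking.profile=")
               .append(URLEncoder.encode(rankProfile, StandardCharsets.UTF_8));
        }
        return url.toString();
    }

    public static void main(String[] args) {
        String yql = "select * from myapp where true";
        // Rank with a profile defined in the schema.
        System.out.println(searchUrl("http://localhost:8080", yql, "my-rank-profile"));
        // Skip ranking when order does not matter or sorting is explicit.
        System.out.println(searchUrl("http://localhost:8080", yql, "unranked"));
    }
}
```

Keeping the profile name a plain request parameter means the same query can be replayed against different profiles, which is convenient for the bucket testing mentioned above.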