Ranking
Vespa defines Big Data Serving as:
Selection, organization and machine-learned model inference,
Ranking enables organization and ML inference, and multi-phased ranking addresses latency and load:
schema myapp {
    rank-profile my-rank-profile {
        first-phase {
            expression: attribute(quality) * freshness(timestamp)
        }
        second-phase {
            expression: sum(onnx(my_onnx_model))
        }
    }
}
Applications use the query API to select the documents to evaluate using a query language, and choose a rank profile for the ranking.
Ranking means evaluating ranking expressions over rank features (values, or values computed from queries, documents and constants).
Note: Vespa also supports stateless model evaluation - making inferences without documents (i.e. query to model).
Performance
Rank profiles can have one or two phases:
- Phase one should use a computationally inexpensive function to rank candidates. This phase is about recall: selecting the best candidates.
- Phase two is run on a small candidate set. This phase is about precision: it spends more resources to fine-tune the ranking of the best candidates at the top.
To tune ranking performance:
- Control the second-phase candidate set size
- Add content nodes to rank fewer documents per node
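The two phases described above can be illustrated with a small, self-contained Python simulation. This is a sketch of the idea only, not how Vespa implements ranking: the documents, the `cheap_score` and `expensive_score` functions and the `rerank_count` argument are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Doc = Dict[str, float]


def two_phase_rank(
    docs: List[Doc],
    first_phase: Callable[[Doc], float],
    second_phase: Callable[[Doc], float],
    rerank_count: int,
) -> List[Tuple[float, Doc]]:
    """Score every document cheaply, then re-score only the top rerank_count."""
    # Phase one: inexpensive score for every candidate.
    scored = sorted(((first_phase(d), d) for d in docs),
                    key=lambda pair: pair[0], reverse=True)
    head, tail = scored[:rerank_count], scored[rerank_count:]
    # Phase two: expensive score for the best first-phase candidates only.
    reranked = sorted(((second_phase(d), d) for _, d in head),
                      key=lambda pair: pair[0], reverse=True)
    # In this sketch, documents that were not re-scored simply stay below
    # the reranked ones and keep their first-phase score.
    return reranked + tail


def cheap_score(doc: Doc) -> float:
    # Hypothetical stand-in for an inexpensive first-phase expression.
    return doc["quality"] * doc["freshness"]


def expensive_score(doc: Doc) -> float:
    # Hypothetical stand-in for a costly model evaluation.
    return doc["quality"] + doc["relevance"]


docs = [
    {"id": 1, "quality": 0.9, "freshness": 0.5, "relevance": 0.3},
    {"id": 2, "quality": 0.8, "freshness": 0.9, "relevance": 0.2},
    {"id": 3, "quality": 0.4, "freshness": 0.9, "relevance": 0.9},
    {"id": 4, "quality": 0.1, "freshness": 0.2, "relevance": 1.0},
]
ranked = two_phase_rank(docs, cheap_score, expensive_score, rerank_count=3)
ranked_ids = [int(doc["id"]) for _, doc in ranked]
print(ranked_ids)  # [3, 1, 2, 4]
```

Document 4 has the highest `relevance` but is never seen by the second phase, because the first phase did not place it in the top three. This is why the first phase is about recall: a smaller `rerank_count` lowers cost, but a good candidate that the cheap function misses cannot be recovered later.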
Machine-Learned model inference
Vespa supports the following ML models:
As these are exposed as rank features, it is possible to rank using a model ensemble. Deploy multiple model instances and write a rank expression that combines the results (max, avg, custom, ...) - example:
schema myapp {
    onnx-model my_model_1 {
        ...
    }
    onnx-model my_model_2 {
        ...
    }
    rank-profile my-rank-profile {
        ...
        second-phase {
            expression: max( sum(onnx(my_model_1)), sum(onnx(my_model_2)) )
        }
    }
}
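The combination step itself can be sketched in plain Python. This is an illustration only: `combine_scores`, the `COMBINERS` table and the example scores are hypothetical, and in Vespa the combination would be written as a ranking expression, as in the schema above.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Hypothetical table of ways to merge per-model scores into one rank score.
COMBINERS: Dict[str, Callable[[Sequence[float]], float]] = {
    "max": max,
    "avg": mean,
}


def combine_scores(model_scores: Sequence[float], how: str = "max") -> float:
    """Combine the outputs of several models into a single score."""
    if not model_scores:
        raise ValueError("need at least one model score")
    return COMBINERS[how](model_scores)


# Assumed outputs of two models for the same document.
scores = [0.25, 0.75]
best = combine_scores(scores, "max")     # 0.75
average = combine_scores(scores, "avg")  # 0.5
```

Taking the maximum favors a document if any one model rates it highly, while the average rewards agreement between models. Which one fits depends on the application.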
Model Training and Deployment
To use data in Vespa to train a model, refer to the Learning to Rank guide.
Models are deployed in application packages. Read more on how to automate training, deployment and re-training in a closed loop using Vespa Cloud.
Rank profile
Ranking expressions are stored in rank profiles. An application can have multiple rank profiles - this can be used for implementing different use cases, or bucket testing ranking variations. If not specified, the default text ranking profile is used.
A rank profile can inherit another rank profile.
Queries select a rank profile using ranking.profile, or in Searcher code:
query.getRanking().setProfile("my-rank-profile");
Note that some use cases (where hits can be in any order, or are explicitly sorted) perform better using the unranked rank profile.
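For the query-parameter route, a request URL can be assembled as in the following sketch. The endpoint `http://localhost:8080`, the YQL string and the helper name `build_query_url` are assumptions made for illustration. Only the `ranking.profile` parameter name is taken from the text above.

```python
from urllib.parse import urlencode


def build_query_url(endpoint: str, yql: str, rank_profile: str) -> str:
    """Build a search URL that selects documents and names a rank profile."""
    params = {
        "yql": yql,                       # which documents to evaluate
        "ranking.profile": rank_profile,  # which rank profile scores them
    }
    return f"{endpoint}/search/?{urlencode(params)}"


# Assumed local endpoint and schema name; adjust for a real deployment.
url = build_query_url(
    "http://localhost:8080",
    "select * from myapp where true",
    "my-rank-profile",
)
print(url)
```

`urlencode` percent-encodes the query string, so the YQL can be passed as ordinary text.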