
Tensor Computation Examples

Tensors can be used to express machine-learned models such as neural nets, but they can be used for much more than that. The tensor model in Vespa is powerful, since it supports sparse dimensions, dimension names and lambda computations. Whatever you want to compute, it is probably possible to express it succinctly as a tensor expression - the problem is learning how. This page collects some real-world examples of tensor usage to provide some inspiration.

Tensor playground

The tensor playground is a tool to get familiar with and explore tensor algebra. It can be found at docs.vespa.ai/playground. Below are some examples of common tensor compute operations using tensor functions. Feel free to play around with them to explore further:

Values that depend on the current time

In an ecommerce application you may have promotions that set a different product price in given time intervals. Since the price is used for ranking, the correct price must be computed in ranking. Can tensors be used to specify prices in arbitrary time intervals in documents and pick the right price during ranking?

To do this, add three tensors to the document type as follows:

field startTime type tensor(id{}) {
    indexing: attribute
}
field endTime type tensor(id{}) {
    indexing: attribute
}
field price type tensor(id{}) {
    indexing: attribute
}

Here the id is an arbitrary label for the promotion which must be unique within the document, and startTime and endTime are epoch timestamps.

Now documents can include promotions as follows (document JSON syntax):

"startTime": { "cells": { "promo1": 40, "promo2": 60, "promo3": 80 } },
"endTime":   { "cells": { "promo1": 50, "promo2": 70, "promo3": 90 } },
"price":     { "cells": { "promo1": 16, "promo2": 18, "promo3": 10 } }

And we can retrieve the currently valid price by the expression

reduce((attribute(startTime) < now) * (attribute(endTime) > now) * attribute(price), max)

This returns 0 if there is no matching interval, so a full expression will probably wrap this in a function, check whether it returns 0 (using an if expression), and return the default price of the product in that case.

To see why this retrieves the right price, notice that (attribute(startTime) < now) is a shorthand for

join(attribute(startTime), now, f(x,y)(x < y))

This joins every cell of the startTime tensor with the zero-dimensional now tensor (i.e., a number), setting the cell value in the joined tensor to 1 if now is larger than the cell timestamp and 0 otherwise. When this tensor is joined by multiplication with one that has 1's only where now is smaller than the end time, the result is a tensor with 1's for promotion ids whose interval is currently valid and 0's elsewhere. Joining that by multiplication with the price tensor gives the final tensor, from which we pick the max value to retrieve the non-zero price.
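To make the two joins and the final reduce concrete, here is a minimal Python sketch that emulates them with plain dictionaries keyed by promotion id. It is not Vespa code: the `join` and `current_price` helpers and the sample timestamps passed as `now` are assumptions made for illustration, while the cell values are the ones from the document above.

```python
# Plain-Python emulation of the ranking expression; dicts stand in for tensor(id{}).
start_time = {"promo1": 40, "promo2": 60, "promo3": 80}
end_time = {"promo1": 50, "promo2": 70, "promo3": 90}
price = {"promo1": 16, "promo2": 18, "promo3": 10}


def join(a, b, f):
    """Combine cells that share a label; b may also be a plain number."""
    if isinstance(b, dict):
        return {k: f(a[k], b[k]) for k in a if k in b}
    return {k: f(v, b) for k, v in a.items()}


def current_price(now):
    # (attribute(startTime) < now) and (attribute(endTime) > now) as 0/1 tensors
    started = join(start_time, now, lambda x, y: 1.0 if x < y else 0.0)
    not_ended = join(end_time, now, lambda x, y: 1.0 if x > y else 0.0)
    valid = join(started, not_ended, lambda x, y: x * y)
    priced = join(valid, price, lambda x, y: x * y)
    return max(priced.values())  # reduce(..., max)


print(current_price(65))  # 18.0, promo2 is active
print(current_price(55))  # 0.0, no promotion is active
```

The second call shows the case discussed above: when no interval matches, every cell is 0 and the caller has to fall back to a default price.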

Play around with this example in the playground

Adding scalars to a tensor

A common situation is that you have dense embedding vectors to which you want to add some scalar attributes (or function return values) as input to a machine-learned model. This can be done by the following expression (assuming the dense vector dimension is named "x"):

concat(concat(query(embedding),attribute(embedding),x), tensor(x[2]):[bm25(title),attribute(popularity)], x)

This creates a tensor from a set of scalar expressions, and concatenates it to the query and document embedding vectors.
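The shape of the result may be easier to see in a small Python sketch, where lists stand in for dense tensors over x. All values here are hypothetical, and the `concat` helper only mimics concatenation along a single indexed dimension.

```python
# Lists stand in for dense tensors over the indexed dimension x.
query_embedding = [0.1, 0.2]      # hypothetical query(embedding)
document_embedding = [0.3, 0.4]   # hypothetical attribute(embedding)
bm25_title = 1.5                  # hypothetical bm25(title) value
popularity = 7.0                  # hypothetical attribute(popularity) value


def concat(a, b):
    """Concatenate along x: the cells of b are placed after the cells of a."""
    return list(a) + list(b)


# Corresponds to tensor(x[2]):[bm25(title),attribute(popularity)]
scalars = [bm25_title, popularity]
model_input = concat(concat(query_embedding, document_embedding), scalars)
print(model_input)  # [0.1, 0.2, 0.3, 0.4, 1.5, 7.0]
```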

Play around with this example in the playground

Dot Product between query and document vectors

Assume we have a set of documents where each document contains a vector of size 4. We want to calculate the dot product between the document vectors and a vector passed down with the query and rank the results according to the dot product score.

The following schema file defines an attribute tensor field with a tensor type that has one indexed dimension x of size 4. In addition, we define a rank profile with the input and the dot product calculation:

schema example {
    document example {
        field document_vector type tensor<float>(x[4]) {
            indexing: attribute | summary
        }
    }
    rank-profile dot_product {
        inputs {
            query(query_vector) tensor<float>(x[4])
        }
        first-phase {
            expression: sum(query(query_vector)*attribute(document_vector))
        }
    }
}

Example JSON document with the vector [1.0, 2.0, 3.0, 5.0], using indexed tensors short form:

{
    "put": "id:example:example::0",
    "fields": {
        "document_vector": [1.0, 2.0, 3.0, 5.0]
    }
}

Example query set in a searcher with the vector [1.0, 2.0, 3.0, 5.0]:

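public Result search(Query query, Execution execution) {
    // Set the query tensor that the rank profile reads as query(query_vector)
    query.getRanking().getFeatures().put("query(query_vector)",
        Tensor.Builder.of(TensorType.fromSpec("tensor<float>(x[4])")).
        cell().label("x", 0).value(1.0).
        cell().label("x", 1).value(2.0).
        cell().label("x", 2).value(3.0).
        cell().label("x", 3).value(5.0).build());
    return execution.search(query);
}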

Play around with this example in the playground

Note that this example calculates the dot product for every document retrieved by the query. Consider using approximate nearest neighbor search with distance-metric dotproduct.
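As a sanity check on the first-phase expression in this section, the following Python sketch computes the same score for the example query and document vectors. It is a plain-Python emulation and not Vespa code; the `dot_product` helper is an assumption for illustration.

```python
query_vector = [1.0, 2.0, 3.0, 5.0]
document_vector = [1.0, 2.0, 3.0, 5.0]


def dot_product(a, b):
    # Multiply cells with the same index in x, then sum over x.
    return sum(x * y for x, y in zip(a, b))


score = dot_product(query_vector, document_vector)
print(score)  # 39.0
```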

Logistic regression models with cross features

One simple way to use machine learning is to generate cross features from a set of base features and then do a logistic regression on these. How can this be expressed as Vespa tensors?

Assume we have three base features:

query(interests): tensor(interest{}) - A sparse, weighted set of the interests of a user.
query(location): tensor(location{})  - A sparse set of the location(s) of the user.
attribute(topics): tensor(topic{})   - A sparse, weighted set of the topics of a given document.

From these we have generated all 3d combinations of these features and trained a logistic regression model, leading to a weight for each possible combination:

tensor(interest{}, location{}, topic{})

This weight tensor can be added as a constant tensor to the application package, say constant(model). With that we can compute the model in a rank profile by the expression

sum(query(interests) * query(location) * attribute(topics) * constant(model))

Here the first three factors generate the 3d cross feature tensor and the last combines them with the learned weights.
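The following Python sketch emulates this expression with dictionaries standing in for the sparse tensors. All labels, feature values and weights are hypothetical, and the `score` helper is only an illustration of the product followed by a sum, not Vespa's implementation.

```python
# Dicts stand in for sparse tensors; all labels and weights are hypothetical.
interests = {"sports": 0.75, "music": 0.25}   # query(interests)
location = {"oslo": 1.0}                      # query(location)
topics = {"football": 0.5, "jazz": 0.5}       # attribute(topics)
model = {                                     # constant(model)
    ("sports", "oslo", "football"): 2.0,
    ("music", "oslo", "jazz"): 1.0,
}


def score(interests, location, topics, model):
    total = 0.0
    # Only combinations with a learned weight contribute, since multiplying
    # by the model keeps just the cells whose labels exist in every factor.
    for (interest, loc, topic), weight in model.items():
        if interest in interests and loc in location and topic in topics:
            total += interests[interest] * location[loc] * topics[topic] * weight
    return total


result = score(interests, location, topics, model)
print(result)  # 0.875
```

Iterating over the model cells rather than over all feature combinations reflects why sparse dimensions help here: combinations without a weight never need to be materialized.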

Play around with this example in the playground

Matrix Product between 1d vector and 2d matrix

Assume we have a 3x2 matrix represented in an attribute tensor field document_matrix with a tensor type tensor<float>(x[3],y[2]) with content:

{ {x:0,y:0}:1.0, {x:1,y:0}:3.0, {x:2,y:0}:5.0, {x:0,y:1}:7.0, {x:1,y:1}:11.0, {x:2,y:1}:13.0 }

Also assume we have a 1x3 vector passed down with the query as a tensor with type tensor<float>(x[3]) with content:

{ {x:0}:1.0, {x:1}:3.0, {x:2}:5.0 }

that is set as query(query_vector) in a searcher as specified in query feature.

To calculate the matrix product between the 1x3 vector and 3x2 matrix (to get a 1x2 vector) use the following ranking expression:

sum(query(query_vector) * attribute(document_matrix),x)

This is a sparse tensor product over the shared dimension x, followed by a sum over the same dimension.
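The same computation can be traced in a short Python sketch using the cell values from this section. Dictionaries keyed by dimension indexes stand in for the tensors, and the `vector_matrix_product` helper is an assumption made for illustration.

```python
# Cells copied from the example: keys are (x, y) for the matrix and x for the vector.
document_matrix = {
    (0, 0): 1.0, (1, 0): 3.0, (2, 0): 5.0,
    (0, 1): 7.0, (1, 1): 11.0, (2, 1): 13.0,
}
query_vector = {0: 1.0, 1: 3.0, 2: 5.0}


def vector_matrix_product(vector, matrix):
    result = {}
    for (x, y), value in matrix.items():
        # Join on the shared dimension x, then sum over x for each y.
        result[y] = result.get(y, 0.0) + vector[x] * value
    return result


product = vector_matrix_product(query_vector, document_matrix)
print(product)  # {0: 35.0, 1: 105.0}
```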

Play around with this example in the playground

Using a tensor as a lookup structure

Tensors with mapped dimensions look similar to maps, but are more general. What if all that is needed is a simple map lookup? See tensor performance for more details.

Assume a tensor attribute my_map and this is the value for a specific document:


To create a query to select which of the 3 named vectors (a,b,c) to use for some other calculation, wrap the wanted label to look up inside a tensor. Assume a query tensor my_key with type/value:


Do the lookup, returning a tensor of type tensor<float>(y[3]):


If the key does not match anything, the result will be empty: tensor<float>(y[3]):[0,0,0]. To return something else, check up front whether the lookup will succeed and run a fallback expression if it will not, like:

if(reduce(query(my_key)*attribute(my_map),count) == 3,
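The lookup and its fallback check described in this section can be emulated in plain Python as follows. This is only a sketch: the vector values in `my_map`, the key labels and the fallback vector are all hypothetical, and the `lookup` helper stands in for the multiply, count and sum steps rather than reproducing the exact ranking expressions.

```python
# Hypothetical values: my_map stands in for a tensor with one mapped dimension
# (labels a, b, c) and one indexed dimension y of size 3.
my_map = {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0], "c": [7.0, 8.0, 9.0]}


def lookup(my_key, my_map, fallback):
    # Multiply by the key tensor: only labels present in both survive.
    matched = {
        label: [weight * v for v in my_map[label]]
        for label, weight in my_key.items()
        if label in my_map
    }
    # Like reduce(..., count): one matching label yields exactly 3 cells.
    cell_count = sum(len(row) for row in matched.values())
    if cell_count != 3:
        return fallback
    # Sum over the mapped dimension, leaving a vector over y.
    return [sum(row[i] for row in matched.values()) for i in range(3)]


print(lookup({"b": 1.0}, my_map, fallback=[0.5, 0.5, 0.5]))  # [4.0, 5.0, 6.0]
print(lookup({"z": 1.0}, my_map, fallback=[0.5, 0.5, 0.5]))  # [0.5, 0.5, 0.5]
```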

Slicing with lambda

A common use case is to use a tensor lambda function to slice out the first k dimensions of a vector representation of m dimensions where m is larger than k. Slicing with lambda functions is great for representing vectors from Matryoshka Representation Learning.

Matryoshka Representation Learning (MRL) encodes information at different granularities and allows a single embedding to adapt to the computational constraints of downstream tasks.

The following slices the first 256 dimensions of a tensor t:

tensor<float>(x[256])(t{x:(x)})

Importantly, this only references the original tensor, avoiding copying it into a smaller tensor. The following is a complete example where we have stored an original vector representation with 3072 dimensions. We slice the first 256 dimensions of the original representation to perform a dot product in the first-phase expression, followed by a full computation over all dimensions in the second-phase expression. See phased ranking for context on using Vespa phased computations and customizing reusable frozen embeddings with Vespa.

schema example {
  document example {
    field document_vector type tensor<float>(x[3072]) {
      indexing: attribute | summary
    }
  }
  rank-profile small-256-first-phase {
    inputs {
      query(query_vector) tensor<float>(x[3072])
    }
    function slice_first_dims(t) {
      expression: l2_normalize(tensor<float>(x[256])(t{x:(x)}), x)
    }
    first-phase {
      expression: sum( slice_first_dims(query(query_vector)) * slice_first_dims(attribute(document_vector)) )
    }
    second-phase {
      expression: sum( query(query_vector) * attribute(document_vector) )
    }
  }
}

See also a runnable example in this tensor playground example.
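For readers who want to check the arithmetic of the two phases outside Vespa, here is a minimal Python sketch on toy vectors. The 4-dimensional vectors and the slice size of 2 are assumptions chosen to keep the numbers readable (the schema uses 3072 and 256), and a Python list slice copies its data, unlike the lambda, which the text describes as referencing the original tensor.

```python
import math


def slice_first_dims(t, k):
    """Take the first k cells of t and L2-normalize them."""
    head = t[:k]  # copies in Python; used here only to show the arithmetic
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Toy stand-ins for query(query_vector) and attribute(document_vector).
query_vector = [3.0, 4.0, 1.0, 2.0]
document_vector = [4.0, 3.0, 5.0, 6.0]

# First phase: dot product over the normalized leading dimensions only.
first_phase = dot(slice_first_dims(query_vector, 2),
                  slice_first_dims(document_vector, 2))
# Second phase: dot product over all dimensions.
second_phase = dot(query_vector, document_vector)
print(first_phase, second_phase)
```

The sketch does not guard against an all-zero slice, which would make the norm 0.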