Tensors can be used to express machine-learned models such as neural nets, but they can be used for much more than that. The tensor model in Vespa is powerful, since it supports sparse dimensions, dimension names and lambda computations. Whatever you want to compute, it is probably possible to express it succinctly as a tensor expression - the problem is learning how. This page collects some real-world examples of tensor usage to provide some inspiration.
The tensor playground is a tool to get familiar with and explore tensor algebra. It can be found at docs.vespa.ai/playground. Below are some examples of common tensor compute operations using tensor functions. Feel free to play around with them to explore further:
In an ecommerce application you may have promotions that set a different product price in given time intervals. Since the price is used for ranking, the correct price must be computed during ranking. Can tensors be used to specify prices in arbitrary time intervals in documents, and to pick the right price during ranking?
To do this, add three tensors to the document type as follows:
field startTime type tensor(id{}) {
    indexing: attribute
}
field endTime type tensor(id{}) {
    indexing: attribute
}
field price type tensor(id{}) {
    indexing: attribute
}
Here, id is an arbitrary label for the promotion, which must be unique within the document, and startTime and endTime are epoch timestamps.
Now documents can include promotions as follows (document JSON syntax):
"startTime": { "cells": { "promo1": 40, "promo2": 60, "promo3": 80 } "endTime": { "cells": { "promo1": 50, "promo2": 70, "promo3": 90 } "price": { "cells": { "promo1": 16, "promo2": 18, "promo3": 10 }
And we can retrieve the currently valid price by the expression
reduce((attribute(startTime) < now) * (attribute(endTime) > now) * attribute(price), max)
This will return 0 if there is no matching interval, so a full solution will wrap this in a function, check whether it returns 0 (using an if expression), and return the default price of the product in that case.
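A minimal sketch of such a wrapper, assuming the regular price is stored in a hypothetical defaultPrice attribute:

rank-profile promo_price {
    function promoPrice() {
        expression: reduce((attribute(startTime) < now) * (attribute(endTime) > now) * attribute(price), max)
    }
    first-phase {
        # Fall back to the regular price when no promotion interval matches now
        expression: if (promoPrice() == 0, attribute(defaultPrice), promoPrice())
    }
}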
To see why this retrieves the right price, notice that (attribute(startTime) < now) is shorthand for

join(attribute(startTime), now, f(x,y)(x < y))

That is, it joins all the cells of the startTime tensor with the zero-dimensional now tensor (i.e. a number), setting each cell value in the joined tensor to 1 if now is larger than the cell's timestamp and 0 otherwise.
When this tensor is joined by multiplication with the one that has 1's only where now is smaller, the result is a tensor with 1's for the promotion ids whose interval is currently valid and 0's otherwise. Joining this by multiplication with the price tensor produces the final tensor, from which we pick the max value to retrieve the non-zero price.
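To make this concrete, assume now is 65 for the example document above:

(startTime < now): {promo1: 1, promo2: 1, promo3: 0}
(endTime > now):   {promo1: 0, promo2: 1, promo3: 1}
multiplied together and with price: {promo1: 0, promo2: 18, promo3: 0}
reduce(..., max): 18

Only promo2's interval [60, 70] contains 65, so its price 18 is returned.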
Play around with this example in the playground
A common situation is that you have dense embedding vectors to which you want to add some scalar attributes (or function return values) as input to a machine-learned model. This can be done with the following expression (assuming the dense vector dimension is named "x"):
concat(concat(query(embedding),attribute(embedding),x), tensor(x[2]):[bm25(title),attribute(popularity)], x)
This creates a tensor from a set of scalar expressions, and concatenates it to the query and document embedding vectors.
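A sketch of how this can be wired into a rank profile, assuming both embeddings have type tensor<float>(x[128]) (the concatenated tensor then has type tensor<float>(x[258])):

rank-profile combined_input {
    inputs {
        query(embedding) tensor<float>(x[128])
    }
    function model_input() {
        # 128 query dims + 128 document dims + 2 scalar features = 258 dims
        expression: concat(concat(query(embedding), attribute(embedding), x), tensor(x[2]):[bm25(title), attribute(popularity)], x)
    }
    first-phase {
        # Placeholder; in practice model_input would be fed to a machine-learned model
        expression: sum(model_input())
    }
}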
Play around with this example in the playground
Assume we have a set of documents where each document contains a vector of size 4. We want to calculate the dot product between the document vectors and a vector passed down with the query and rank the results according to the dot product score.
The following schema file defines an attribute tensor field with a tensor type that has one indexed dimension x of size 4. In addition, we define a rank profile with the query input and the dot product calculation:
schema example {
    document example {
        field document_vector type tensor<float>(x[4]) {
            indexing: attribute | summary
        }
    }
    rank-profile dot_product {
        inputs {
            query(query_vector) tensor<float>(x[4])
        }
        first-phase {
            expression: sum(query(query_vector) * attribute(document_vector))
        }
    }
}
Example JSON document with the vector [1.0, 2.0, 3.0, 5.0], using the indexed tensor short form:
[ { "put": "id:example:example::0", "fields": { "document_vector" : [1.0, 2.0, 3.0, 5.0] } } ]
Example query set in a searcher with the vector [1.0, 2.0, 3.0, 5.0]:
public Result search(Query query, Execution execution) {
    query.getRanking().getFeatures().put("query(query_vector)",
            Tensor.Builder.of(TensorType.fromSpec("tensor<float>(x[4])"))
                    .cell().label("x", 0).value(1.0)
                    .cell().label("x", 1).value(2.0)
                    .cell().label("x", 2).value(3.0)
                    .cell().label("x", 3).value(5.0)
                    .build());
    return execution.search(query);
}
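Alternatively, recent Vespa versions allow passing the query tensor directly as an HTTP request parameter instead of writing a Searcher, e.g.:

input.query(query_vector)=[1.0, 2.0, 3.0, 5.0]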
Play around with this example in the playground
Note that this example calculates the dot product for every document retrieved by the query. Consider using approximate nearest neighbor search with distance-metric dotproduct instead.
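A sketch of what that could look like in the schema, assuming an HNSW index is acceptable for the use case:

field document_vector type tensor<float>(x[4]) {
    indexing: attribute | index
    attribute {
        distance-metric: dotproduct
    }
    index {
        hnsw {
            max-links-per-node: 16
            neighbors-to-explore-at-insert: 200
        }
    }
}

The query then retrieves candidates with the nearestNeighbor operator rather than computing the dot product for every match.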
One simple way to use machine learning is to generate cross features from a set of base features and then do a logistic regression on these. How can this be expressed as Vespa tensors?
Assume we have three base features:
query(interests): tensor(interest{}) - A sparse, weighted set of the interests of a user.
query(location): tensor(location{}) - A sparse set of the location(s) of the user.
attribute(topics): tensor(topic{}) - A sparse, weighted set of the topics of a given document.
From these we have generated all 3d combinations of the features and trained a logistic regression model, leading to a weight for each possible combination:
tensor(interest{}, location{}, topic{})
This weight tensor can be added as a constant tensor to the application package, say constant(model). With that, we can compute the model in a rank profile by the expression
sum(query(interests) * query(location) * attribute(topics) * constant(model))
where the first three factors generate the 3d cross-feature tensor and the last combines it with the learned weights.
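A small worked example with hypothetical values:

query(interests):  {sports: 0.8, finance: 0.1}
query(location):   {oslo: 1.0}
attribute(topics): {sports: 1.0}

The product has one cell per (interest, location, topic) combination:

{interest:sports,  location:oslo, topic:sports}: 0.8
{interest:finance, location:oslo, topic:sports}: 0.1

If constant(model) weights these cells 0.3 and -0.2, the sum is 0.8*0.3 + 0.1*(-0.2) = 0.22.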
Play around with this example in the playground
Assume we have a 3x2 matrix represented in an attribute tensor field document_matrix with the tensor type tensor<float>(x[3],y[2]) and content:
{ {x:0,y:0}:1.0, {x:1,y:0}:3.0, {x:2,y:0}:5.0, {x:0,y:1}:7.0, {x:1,y:1}:11.0, {x:2,y:1}:13.0 }
Also assume we have a 1x3 vector passed down with the query as a tensor with the type tensor<float>(x[3]) and content:
{ {x:0}:1.0, {x:1}:3.0, {x:2}:5.0 }
that is set as query(query_vector) in a searcher, as specified in the query feature.
To calculate the matrix product between the 1x3 vector and 3x2 matrix (to get a 1x2 vector) use the following ranking expression:
sum(query(query_vector) * attribute(document_matrix),x)
This is a tensor product over the shared dimension x, followed by a sum over the same dimension.
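With the values above, the result is tensor<float>(y[2]):[35, 105]:

y:0 = 1.0*1.0 + 3.0*3.0  + 5.0*5.0  = 35
y:1 = 1.0*7.0 + 3.0*11.0 + 5.0*13.0 = 105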
Play around with this example in the playground
Tensors with mapped dimensions look similar to maps, but are more general. What if all that is needed is a simple map lookup? See tensor performance for more details.
Assume a tensor attribute my_map, where this is the value for a specific document:
tensor<float>(x{},y[3]):{a:[1,2,3],b:[4,5,6],c:[7,8,9]}
To create a query that selects which of the 3 named vectors (a, b, c) to use in some other calculation, wrap the label to look up inside a tensor.
Assume a query tensor my_key with this type/value:
tensor<float>(x{}):{b:1.0}
Do the lookup, returning a tensor of type tensor<float>(y[3]):
sum(query(my_key)*attribute(my_map),x)
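With the example values above, the product only keeps the cells under the matching label b, and summing over x leaves its vector:

query(my_key) * attribute(my_map)  ->  tensor<float>(x{},y[3]):{b:[4,5,6]}
sum(..., x)                        ->  tensor<float>(y[3]):[4,5,6]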
If the key does not match anything, the result will be empty: tensor<float>(y[3]):[0,0,0].
To return something else when there is no match, add an up-front check of whether the lookup succeeds (the product has exactly 3 cells, one per y index, when the key matches, and none otherwise), and run a fallback expression if it does not:
if (reduce(query(my_key) * attribute(my_map), count) == 3,
    reduce(query(my_key) * attribute(my_map), sum, x),
    tensor<float>(y[3]):[0.5, 0.5, 0.5])
The above syntax allows an optimized execution; find an example in the Tensor Playground.
A common use case is to use a tensor lambda function to slice out the first k dimensions of a vector representation of m dimensions, where m is larger than k.
Slicing with lambda functions is great for representing vectors from Matryoshka Representation Learning (MRL), which encodes information at different granularities and allows a single embedding to adapt to the computational constraints of downstream tasks.
The following slices the first 256 dimensions of a tensor t:
tensor<float>(x[256])(t{x:(x)})
Importantly, this only references into the original tensor, avoiding copying it into a smaller tensor.
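For instance, applied to a small tensor:

t = tensor<float>(x[4]):[1.0, 2.0, 3.0, 4.0]
tensor<float>(x[2])(t{x:(x)})  evaluates to  tensor<float>(x[2]):[1.0, 2.0]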
The following is a complete example where we have stored an original vector representation with 3072 dimensions, and we slice the first 256 dimensions of it to perform a dot product in the first-phase expression, followed by a full computation over all dimensions in the second-phase expression. See phased ranking for context on using Vespa phased computations and customizing reusable frozen embeddings with Vespa.
schema example {
    document example {
        field document_vector type tensor<float>(x[3072]) {
            indexing: attribute | summary
        }
    }
    rank-profile small-256-first-phase {
        inputs {
            query(query_vector) tensor<float>(x[3072])
        }
        function slice_first_dims(t) {
            expression: l2_normalize(tensor<float>(x[256])(t{x:(x)}), x)
        }
        first-phase {
            expression: sum(slice_first_dims(query(query_vector)) * slice_first_dims(attribute(document_vector)))
        }
        second-phase {
            expression: sum(query(query_vector) * attribute(document_vector))
        }
    }
}
See also a runnable example in this tensor playground example.