Vespa provides a tensor data model and computation engine to support advanced computations over data, such as computing neural network ranking models. This guide introduces tensors with some examples of use. For details, refer to the tensor reference.
For a quick introduction to tensors, refer to tensor concepts and the tensor reference guide. See also using TensorFlow models or using ONNX models, which explain how to use TensorFlow or ONNX models directly in Vespa, and a collection of examples of using tensors to perform various computations.
Also, please explore the tensor playground. It is a tool to get familiar with and explore tensors and tensor expressions in a safe environment.
In typical use, a document contains one or more tensor fields to be used for ranking. This example sets up a tensor field called tensor_attribute:
field tensor_attribute type tensor<float>(x[4]) {
    indexing: attribute | summary
}
A tensor requires a type: x[4] means an indexed dimension of size 4, whereas x{} means a mapped dimension. See tensor field in schemas. For details on tensor types, refer to the tensor type reference.
There are two options when feeding tensors.
Tensors can be updated: one can add, remove and modify tensor cells, or assign a completely new tensor value.
Tensors can only be used for ranking, not searching. The tensor can either be supplied in the query request, or constructed from some other data or data source. In the latter case, refer to the Tensor Java API for details on how to construct tensors programmatically. Query/Context tensors must be defined in the application package (name, type and dimension), see defining query feature types.
Tensors are used during ranking to modify a document's rank score given the query. Typical operations are dot products between tensors of order 1 (vectors), or matrix products between tensors of order 2 (matrices). Tensors are used in rank expressions as rank features. Two rank features are defined above:
attribute(tensor_attribute): the tensor associated with the document
query(tensor): the tensor sent with the query request
These can be used in rank expressions. Note that the final rank score of a document must be a single double value. Example:
rank-profile dot_product {
    first-phase {
        expression: sum(query(tensor)*attribute(tensor_attribute))
    }
}
This takes the product of the query tensor and the document tensor, and sums all cells, thus resolving into a single value which is used as the rank score. In the case above, the value is 39.0 (1*1 + 2*2 + 3*3 + 5*5).
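The arithmetic can be reproduced outside Vespa. The following is a minimal Python sketch, assuming that the query tensor and the document tensor both hold the values 1, 2, 3 and 5, as the sum 1*1 + 2*2 + 3*3 + 5*5 implies. The names query_tensor, document_tensor and dot_product are illustrative only and are not part of any Vespa API.

```python
# Hypothetical cell values, chosen to match 1*1 + 2*2 + 3*3 + 5*5.
query_tensor = [1.0, 2.0, 3.0, 5.0]      # stands in for query(tensor)
document_tensor = [1.0, 2.0, 3.0, 5.0]   # stands in for attribute(tensor_attribute)


def dot_product(a, b):
    """Multiply cells that share the same index along x, then sum the products."""
    if len(a) != len(b):
        raise ValueError("both tensors must have the same dimension size")
    return sum(x * y for x, y in zip(a, b))


score = dot_product(query_tensor, document_tensor)
print(score)  # 39.0
```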
There are some ranking functions that are specific for tensors:
map(tensor, f(x)(...)): Returns a new tensor with the lambda function defined in f(x)(...) applied to each cell.
reduce(tensor, aggregator, dim1, dim2, ...): Returns a new tensor with the aggregator applied across dimensions dim1, dim2, etc. If no dimensions are specified, reduce over all dimensions.
join(tensor1, tensor2, f(x,y)(...)): Returns a new tensor constructed from the natural join between tensor1 and tensor2, with the resulting cells having the value as calculated from f(x,y)(...), where x is the cell value from tensor1 and y from tensor2.
Using these primitives, the sum(query(tensor)*attribute(tensor_attribute)) expression can be written as:
rank-profile dot_product {
    first-phase {
        expression {
            reduce(
                join(query(tensor), attribute(tensor_attribute), f(x,y)(x * y)),
                sum
            )
        }
    }
}
This form represents the general dot product for tensors of any order. Details about tensor ranking functions, including lambda expressions and available aggregators, can be found in the tensor reference documentation.
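To make the semantics of map, reduce and join concrete, here is a small Python sketch that models a tensor as a dictionary from cell address to value. This representation and the helper names (address, tensor_map, tensor_join, tensor_reduce_sum) are assumptions made for illustration; they say nothing about how Vespa implements these functions, and only the sum aggregator is covered.

```python
from collections import defaultdict


def address(**labels):
    """Build a hashable cell address, e.g. address(x=0, y=1)."""
    return frozenset(labels.items())


def tensor_map(tensor, f):
    """Apply f to every cell value."""
    return {addr: f(value) for addr, value in tensor.items()}


def tensor_join(t1, t2, f):
    """Natural join: combine cells whose labels agree on all shared dimensions."""
    result = {}
    for addr1, v1 in t1.items():
        d1 = dict(addr1)
        for addr2, v2 in t2.items():
            d2 = dict(addr2)
            if all(d1[dim] == d2[dim] for dim in d1.keys() & d2.keys()):
                result[frozenset({**d1, **d2}.items())] = f(v1, v2)
    return result


def tensor_reduce_sum(tensor, *dims):
    """Sum over the given dimensions, or over all dimensions if none are given."""
    if not dims:
        dims = {dim for addr in tensor for dim, _ in addr}
    totals = defaultdict(float)
    for addr, value in tensor.items():
        kept = frozenset((dim, label) for dim, label in addr if dim not in dims)
        totals[kept] += value
    return dict(totals)


# Hypothetical order-1 tensors with the cell values 1, 2, 3 and 5.
query = {address(x=i): v for i, v in enumerate([1.0, 2.0, 3.0, 5.0])}
document = {address(x=i): v for i, v in enumerate([1.0, 2.0, 3.0, 5.0])}

product = tensor_join(query, document, lambda x, y: x * y)
score = tensor_reduce_sum(product)[frozenset()]
print(score)  # 39.0
```

Because the join matches cells on whichever dimensions the two tensors share, the same two helpers also express a matrix product: join a tensor over (x, y) with one over (y, z), then reduce over y.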
Create a tensor on the fly from another field type:
tensor(x{}):{x1:attribute(foo)}
The tensorFrom... functions can also be used, see document features.
Tensor expressions are fairly concise, and since the expressions themselves are independent of the data size, the actual workload during ranking can be significant for large tensors.
When using tensors in ranking, it is important to understand the potential computational cost for each query. As an example, assume the dot product of two tensors with 1000 values each, e.g. tensor<double>(x[1000]). Assuming one query tensor and one document tensor, the operation is:
sum(query(tensor1) * attribute(tensor2))
If 8 bytes are used to store each value (e.g. using a double), each tensor is approximately 8 KB. With, for instance, a Haswell architecture, the theoretical upper memory bandwidth is 68 GB/s, which allows around 9 million document ranking evaluations per second. With 1 million documents, this means the maximum throughput, with regard to pure memory bandwidth, is about 9 queries per second (per node).
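The estimate can be checked with a few lines of Python. This sketch assumes decimal units (1 KB = 1000 bytes, 1 GB = 10^9 bytes) and that only the document tensor has to be read from memory for each evaluation. Under those assumptions the exact figures are 8.5 million evaluations and 8.5 queries per second, which the text rounds to 9.

```python
values_per_tensor = 1000
bytes_per_value = 8          # one double per cell
memory_bandwidth = 68e9      # bytes per second, the theoretical upper bound quoted above
documents = 1_000_000

bytes_per_tensor = values_per_tensor * bytes_per_value          # 8000 bytes, about 8 KB
evaluations_per_second = memory_bandwidth / bytes_per_tensor    # document tensors read per second
queries_per_second = evaluations_per_second / documents         # every query touches all documents

print(bytes_per_tensor)         # 8000
print(evaluations_per_second)   # 8500000.0
print(queries_per_second)       # 8.5
```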
Even though you would typically not do the above without reducing the search space first (using matching and first phase), it is important to consider the memory bandwidth and other hardware limitations when developing ranking expressions with tensors.
double: The 64-bit floating-point "double" format is the default cell type. It gives the best precision at the cost of high memory usage and somewhat slower calculations. Using a smaller value type increases performance, trading off precision, so consider changing to one of the cell types below before scaling your application.

float: The usual 32-bit floating-point format "float" should usually be used for all tensors when scaling for production. (Note that other frameworks, like TensorFlow, will also prefer 32-bit floats.) A vector with 1000 dimensions, tensor<float>(x[1000]), would then use approximately 4 KB of memory per tensor value.

bfloat16: If memory (or memory bandwidth) is still a concern, it's possible to change the most space-consuming tensors to use the bfloat16 cell type.
Note that when doing calculations
In some cases, having tensors with 
int8: If machine learning is used to generate a model with data quantization, you can target the int8 cell value type, which is a signed integer with range from -128 to +127 only. This is also treated like a "float with limited range and lossy compression" by the Vespa tensor framework, and gives results as if it was a 32-bit float when any calculation is done. This type is also suitable when representing boolean values (0 or 1). Note that if the input for an int8 cell is not directly representable, the resulting cell value is undefined, so you should take care to only input numbers in the [-128,127] range. It's also possible to use int8 to represent binary data for hamming distance NearestNeighbor search.
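As an illustration of int8 cells carrying binary data, the Python sketch below packs a bit vector into signed 8-bit values and counts the differing bits between two such vectors. The most-significant-bit-first ordering and the helper names pack_bits_to_int8 and hamming_distance are assumptions made for this example; this guide does not specify how bits should be laid out.

```python
def pack_bits_to_int8(bits):
    """Pack 0/1 values, eight per cell, into signed int8 cell values."""
    if len(bits) % 8 != 0:
        raise ValueError("the number of bits must be a multiple of 8")
    cells = []
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | (bit & 1)   # assumed order: first bit is most significant
        # Reinterpret the unsigned byte 0..255 as a signed value in [-128, 127].
        cells.append(byte - 256 if byte > 127 else byte)
    return cells


def hamming_distance(cells_a, cells_b):
    """Count the bits that differ between two equally long lists of int8 cells."""
    if len(cells_a) != len(cells_b):
        raise ValueError("both vectors must have the same number of cells")
    return sum(bin((a ^ b) & 0xFF).count("1") for a, b in zip(cells_a, cells_b))


a = pack_bits_to_int8([1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1])
b = pack_bits_to_int8([0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0])
print(a)                       # [-128, -1]
print(b)                       # [127, -2]
print(hamming_distance(a, b))  # 9
```

Every packed value stays inside the [-128,127] range, so each one is directly representable as an int8 cell.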

In addition to document tensors and query tensors, constant tensors can be put in the application package. This is useful when constant tensors are used in ranking expressions, for instance machine learned models. Example:
constant tensor_constant {
    file: constants/constant_tensor_file.json
    type: tensor<float>(x[4])
}
This defines a new tensor rank feature with the type as defined and the contents distributed with the application package in the file constants/constant_tensor_file.json. The file uses the tensor JSON format and can be compressed. See the reference for examples.
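A file like this can be generated with a short script. The Python sketch below assumes the verbose "cells" layout of the tensor JSON format, in which each cell lists its address and value; confirm the exact layout against the reference before relying on it. The cell values and the helper name to_tensor_json are hypothetical.

```python
import json


def to_tensor_json(values, dimension="x"):
    """Build the assumed verbose layout: one entry per cell with its address and value."""
    return {
        "cells": [
            {"address": {dimension: str(index)}, "value": float(value)}
            for index, value in enumerate(values)
        ]
    }


# Hypothetical contents for a tensor<float>(x[4]) constant.
constant = to_tensor_json([1.0, 2.0, 3.0, 5.0])
text = json.dumps(constant, indent=2)
print(text)
```

The printed text would be saved as constants/constant_tensor_file.json in the application package.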
To use this tensor in a rank expression, encapsulate the constant name with constant(...):
rank-profile use_constant_tensor {
    first-phase {
        expression: sum(query(tensor) * attribute(tensor_attribute) * constant(tensor_constant))
    }
}
The above expression combines three tensors: the query tensor, the document tensor and a constant tensor.
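In Vespa, a tensor is a data structure which is a generalization of scalars, vectors and matrices. Tensors can have any order: a tensor of order 0 is a scalar, a tensor of order 1 is a vector, and a tensor of order 2 is a matrix.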
Tensors consist of a set of double valued cells, with each cell having a unique address. A cell's address is specified by its index or label along all dimensions. The number of dimensions in a tensor is the rank of the tensor. Each dimension can be either mapped or indexed. Mapped dimensions are sparse and allow any label (string identifier) designating their address, while indexed dimensions use dense numeric indices starting at 0.
Example: Using literal form, the tensor:
{ {x:2, y:1}:1.0, {x:0, y:2}:1.0 }
has two dimensions named x and y, and has two cells with defined values: 1.0 at address {x:2, y:1} and 1.0 at address {x:0, y:2}.
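One way to picture the address model is as a dictionary from cell address to value. The Python sketch below encodes the literal tensor that way. The representation and the helper names dimensions and cell_value are assumptions for illustration only, not Vespa's internal format.

```python
# The literal { {x:2, y:1}:1.0, {x:0, y:2}:1.0 } as a mapping from address to value.
# Each address is a tuple of (dimension, label) pairs, sorted by dimension name.
cells = {
    (("x", "2"), ("y", "1")): 1.0,
    (("x", "0"), ("y", "2")): 1.0,
}


def dimensions(tensor):
    """Return the sorted dimension names used by the cell addresses."""
    return sorted({dim for addr in tensor for dim, _ in addr})


def cell_value(tensor, **labels):
    """Look up a cell by its full address; return None if no value is defined there."""
    key = tuple(sorted((dim, str(label)) for dim, label in labels.items()))
    return tensor.get(key)


print(dimensions(cells))             # ['x', 'y']
print(cell_value(cells, x=2, y=1))   # 1.0
print(cell_value(cells, x=0, y=0))   # None
```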
A type declaration is needed for tensors. This defines a 2-dimensional mapped tensor (matrix) of float:
tensor<float>(x{},y{})
This is a 2-dimensional indexed tensor (a 2x3 matrix) of double:
tensor<double>(x[2],y[3])
A combination of mapped and indexed dimensions is a mixed tensor:
tensor<float>(x{},y[3])
Vespa uses the type information to optimize execution plans at configuration time. For dense data the best performance is achieved with indexed dimensions.
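To show how a type string breaks down into a cell type and a list of dimensions, here is a small Python sketch that parses the forms used in this guide. It is not Vespa's parser: the regular expressions and the function name parse_tensor_type are assumptions, and only name{} (mapped) and name[size] (indexed) dimensions are handled. The cell type falls back to double, which this guide describes as the default.

```python
import re

TYPE_PATTERN = re.compile(r"^tensor(?:<(\w+)>)?\((.*)\)$")
DIM_PATTERN = re.compile(r"^(\w+)(?:\{\}|\[(\d+)\])$")


def parse_tensor_type(spec):
    """Return (cell_type, dimensions, category) for a tensor type string."""
    match = TYPE_PATTERN.match(spec.strip())
    if not match:
        raise ValueError(f"not a tensor type: {spec!r}")
    cell_type = match.group(1) or "double"   # double is the default cell type
    dims = []
    for part in match.group(2).split(","):
        dim = DIM_PATTERN.match(part.strip())
        if not dim:
            raise ValueError(f"unsupported dimension: {part!r}")
        name, size = dim.group(1), dim.group(2)
        if size is None:
            dims.append((name, "mapped", None))
        else:
            dims.append((name, "indexed", int(size)))
    kinds = {kind for _, kind, _ in dims}
    category = "mixed" if len(kinds) == 2 else kinds.pop()
    return cell_type, dims, category


print(parse_tensor_type("tensor<float>(x{},y{})"))
print(parse_tensor_type("tensor<double>(x[2],y[3])"))
print(parse_tensor_type("tensor<float>(x{},y[3])"))
```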