Stateless Model Evaluation

Vespa's specialty is evaluating machine-learned models quickly over large numbers of data points. However, it can also be used to evaluate a model once per request in stateless containers. All machine-learned models (e.g. TensorFlow and ONNX) added to the models/ directory of the application package can be used to compute inferences from Java code using the following steps.

1. Add the model evaluation tag

Add the model-evaluation tag inside the container (/jdisc) clusters where it is needed in services.xml:

<container>
    ...
    <model-evaluation/>
    ...
</container>

2. Add a model-evaluation dependency

Add the following dependency in your pom.xml:

        <dependency>
            <groupId>com.yahoo.vespa</groupId>
            <artifactId>container</artifactId>
            <scope>provided</scope>
        </dependency>

(Or, if you want the minimal dependency, depend on model-evaluation instead of container.)

3. Have ModelsEvaluator injected

Have the Java component which should evaluate models take a ai.vespa.models.evaluation.ModelsEvaluator instance as a constructor argument (Vespa will automatically inject it).
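A minimal sketch of such a component, assuming a hypothetical class name MyModelComponent (any container component or handler constructor works the same way):

```java
import ai.vespa.models.evaluation.ModelsEvaluator;

// Hypothetical component: Vespa's dependency injection supplies the
// ModelsEvaluator built from the models/ directory of the application package.
public class MyModelComponent {

    private final ModelsEvaluator modelsEvaluator;

    public MyModelComponent(ModelsEvaluator modelsEvaluator) {
        this.modelsEvaluator = modelsEvaluator;
    }
}
```

The component must be declared in services.xml in the same container cluster as the model-evaluation tag for injection to take place.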

4. Use ModelsEvaluator to make evaluations

Use the ModelsEvaluator API (from any thread) to make inferences. Sample code:

import ai.vespa.models.evaluation.ModelsEvaluator;
import ai.vespa.models.evaluation.FunctionEvaluator;
import com.yahoo.tensor.Tensor;
import com.yahoo.tensor.TensorType;

... 

FunctionEvaluator evaluator = modelsEvaluator.evaluatorOf("myModel", "mySignature", "myOutput"); // Unambiguous args may be skipped

Tensor.Builder b = Tensor.Builder.of(new TensorType.Builder().indexed("d0", 3).build());
b.cell(0.1, 0);
b.cell(0.2, 1);
b.cell(0.3, 2);
Tensor input = b.build();

evaluator.bind("myInput", input);
Tensor result = evaluator.evaluate(); // Note: An evaluator must be discarded after a single use
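To read values out of the result, the com.yahoo.tensor API can be used directly. A hedged sketch (continuing from the sample above; the reduction shown is just one illustration, not part of the required flow):

```java
// Iterate over all cells of the result tensor
result.cellIterator().forEachRemaining(cell ->
        System.out.println(cell.getKey() + " -> " + cell.getValue()));

// Or reduce the tensor to a single number, e.g. by summing all cells
double sum = result.sum().asDouble();
```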