The SPANN (Space Partitioned ANN) approach for approximate nearest neighbor search is described in SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search. SPANN uses a hybrid combination of graph and inverted index methods for approximate nearest neighbor search.
We recommend you read Billion-scale vector search using hybrid HNSW-IF for details on how SPANN is implemented using Vespa, before running this example application. Excerpt:
SPANN searches for the k closest centroid vectors of the query vector in the in-memory ANN search data structure. Then, it reads the k associated posting lists for the retrieved centroids and computes the distance between the query vector and the vector data read from the posting list:
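The two-phase search described above can be sketched in a few lines of Python. This is a minimal, brute-force illustration of the idea, not the Vespa implementation: in the real system, phase 1 uses an in-memory ANN structure (HNSW in the blog post) and the posting lists are read from disk.

```python
import numpy as np

def spann_search(query, centroids, posting_lists, k=10, n_clusters=3):
    """Sketch of SPANN's two-phase search (illustrative, not the Vespa API):
    1) find the closest centroids, 2) scan their posting lists exactly."""
    # Phase 1: brute force stands in for the in-memory ANN search structure
    d = np.linalg.norm(centroids - query, axis=1)
    closest = np.argsort(d)[:n_clusters]
    # Phase 2: exact distances over the vectors in the selected posting lists
    candidates = []
    for c in closest:
        for doc_id, vec in posting_lists[c]:
            candidates.append((np.linalg.norm(vec - query), doc_id))
    candidates.sort()
    return [doc_id for _, doc_id in candidates[:k]]
```

Searching more clusters (`n_clusters`) trades query cost for recall, which is the knob evaluated at the end of this guide.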

This sample application demonstrates how to represent SPANN using Vespa.
Setup:
Create a tenant on Vespa Cloud:
Go to console.vespa-cloud.com and create your tenant (unless you already have one).
Install the Vespa CLI using Homebrew:
$ brew install vespa-cli
Windows/No Homebrew? See the Vespa CLI page to download directly.
Configure the Vespa client:
$ export VESPA_CLI_HOME=$PWD/.vespa
$ vespa config set target cloud
$ vespa config set application vespa-team.autotest
Use your own tenant name from step 1 instead of "vespa-team", here and in the other steps of this guide.
Get Vespa Cloud control plane access:
$ vespa auth login
Follow the instructions from the command to authenticate.
Clone a sample application:
$ vespa clone billion-scale-vector-search myapp && cd myapp
See sample-apps for other sample apps you can clone.
Add a certificate for data plane access to the application:
$ vespa auth cert app
It is a good idea to take note of the path to the .pem files written here.
This sample app uses the Microsoft SPACEV vector dataset from big-ann-benchmarks.com. It uses the first 10M vectors of the 100M slice sample. This sample file is about 1GB (10M vectors):
$ curl -L -o spacev10m_base.i8bin \
  https://data.vespa-cloud.com/sample-apps-data/spacev10m_base.i8bin
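If you want to inspect the downloaded file, it can be read with a short script. The header layout assumed below (a uint32 vector count followed by a uint32 dimension, then int8 vector data) is our reading of the big-ann-benchmarks format; the bundled create-vespa-feed.py is the authoritative reader.

```python
import numpy as np

def read_i8bin(path, max_vectors=None):
    """Read an .i8bin file: uint32 count, uint32 dimension, then int8 data
    (assumed layout; see create-vespa-feed.py for the authoritative parsing)."""
    with open(path, "rb") as f:
        n, dim = (int(x) for x in np.fromfile(f, dtype=np.uint32, count=2))
        if max_vectors is not None:
            n = min(n, max_vectors)
        return np.fromfile(f, dtype=np.int8, count=n * dim).reshape(n, dim)
```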
Install dependencies and create the feed files for the first 10M vectors from the 100M sample:
$ pip3 install numpy requests tqdm
$ python3 app/src/main/python/create-vespa-feed.py spacev10m_base.i8bin
Output:
graph-vectors.jsonl
if-vectors.jsonl
Build the application:
$ mvn clean package -U -f app
Deploy the application:
$ vespa deploy --wait 900 ./app
Wait for the application endpoint to become available:
$ vespa status --wait 300
Test basic functionality:
$ vespa test app/src/test/application/tests/system-test/feed-and-search-test.json
See CD tests for details.
The graph vectors must be fed before the IF vectors:
$ vespa feed graph-vectors.jsonl
$ vespa feed if-vectors.jsonl
Now is a good time to open the Vespa Cloud Dashboard to track progress.
Refer to the <resources> configuration to manage the feeding speed - more CPU means faster feeding, e.g.:
<resources vcpu="8" memory="16Gb" disk="50Gb"/>
Use the instance type reference to find good combinations. Run time for a 2 VCPU deployment vs. 8 VCPU:

Observe the feed and query phases (below) of this guide:

Download the query vectors and the ground truth for the 10M first vectors:
$ curl -L -o query.i8bin \
  https://github.com/microsoft/SPTAG/raw/main/datasets/SPACEV1B/query.bin
$ curl -L -o spacev10m_gt100.i8bin \
  https://data.vespa-cloud.com/sample-apps-data/spacev10m_gt100.i8bin
Find the path to the credentials from the vespa auth cert step above, like
/Users/username/.vespa/tenant_name.autotest.default/data-plane-public-cert.pem
Replace the two filenames in the command below. (This is not needed when running a local test.)
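As a sketch of what a single query looks like, a nearest-neighbor request body can be built and sent with the certificate pair. The field name `vector`, the query input name `q`, and the YQL shape are assumptions for illustration; check the app's schema for the real names, and note that recall.py issues the actual queries.

```python
def build_nn_query(query_vector, hits=10):
    """Build a hypothetical nearestNeighbor query body (field name "vector"
    and input name "q" are assumptions; see the app schema for real names)."""
    return {
        "yql": f"select * from sources * where {{targetHits:{hits}}}nearestNeighbor(vector, q)",
        "input.query(q)": query_vector,
        "hits": hits,
    }

# POST it with the data-plane certificate pair from `vespa auth cert`, e.g.:
# requests.post(endpoint + "/search/", json=build_nn_query(vec),
#               cert=(cert_pem_path, key_pem_path))
```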
Run first 1K queries and evaluate recall@10. A higher number of clusters gives higher recall:
$ ENDPOINT=$(vespa status --format=plain)
$ python3 app/src/main/python/recall.py \
--endpoint ${ENDPOINT}/search/ \
--query_file query.i8bin \
--query_gt_file spacev10m_gt100.i8bin \
--certificate $PWD/../.vespa/vespa-team.autotest.default/data-plane-public-cert.pem \
--key $PWD/../.vespa/vespa-team.autotest.default/data-plane-private-key.pem
See the blog post for details about this script.
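The metric the script reports can be summarized as follows - a minimal sketch of recall@k, with recall.py as the authoritative implementation:

```python
def recall_at_k(retrieved, ground_truth, k=10):
    """Fraction of the true k nearest neighbors present in the retrieved
    top-k, averaged over all queries (sketch of what recall.py reports)."""
    hits = 0
    for got, truth in zip(retrieved, ground_truth):
        hits += len(set(got[:k]) & set(truth[:k]))
    return hits / (len(retrieved) * k)
```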
Remove the instance when done:
$ vespa destroy --force
Prerequisites:
$ podman machine init --memory 6000
$ podman machine start
NO_SPACE: the vespaengine/vespa container image, plus headroom for data, requires free disk space.
Read more.
Verify memory limits:
$ docker info | grep "Total Memory"
or
$ podman info | grep "memTotal"
Install Vespa CLI:
$ brew install vespa-cli
For local deployment:
$ vespa config set target local
Download this sample application:
$ vespa clone billion-scale-vector-search myapp && cd myapp
Pull and start the Vespa image:
$ docker pull vespaengine/vespa
$ docker run --detach --name vespa --hostname vespa-container \
  --publish 127.0.0.1:8080:8080 --publish 127.0.0.1:19071:19071 \
  vespaengine/vespa
Verify that the configuration service (deploy api) is ready:
$ vespa status deploy --wait 300
At this point, you can continue the guide from download vector data.
When done, remove the container:
$ docker rm -f vespa