Approximate Nearest Neighbor Search using HNSW Index

For an introduction to nearest neighbor search see Nearest Neighbor Search documentation. This document describes how to speed up searches for nearest neighbors by adding a HNSW index to the tensor field. Vespa implements a modified version of the Hierarchical Navigable Small World (HNSW) graphs algorithm (paper). The implementation in Vespa supports:

  • Meta Data Filtering - The search for nearest neighbors can be constrained by query filters as the nearest neighbor search in Vespa is expressed as a query operator. This can be combined with other filters or query terms using the Vespa query language.
  • Real Time Indexing - Add, remove and update vectors in the index with low latency, using the document JSON format for tensors.
  • Mutable HNSW Index - This ensures no query overhead from having to search multiple HNSW indexes.
  • Multi-threaded Indexing - The costly part when doing changes to the index is searching the graph layers to find which parts to change. This is handled by multiple threads, controlled by the feeding concurrency setting.
  • Multi-threaded Searching - Several threads can be used per query. See Scaling latency in a content group for details.

The query examples in nearest neighbor search uses the brute force variant which has perfect accuracy but which is computationally expensive for large document volumes as the distance needs to be calculated for every document which matches the query filters in the query. To enable fast approximate matching we need to alter our document schema and add an index section to our tensor definitions. The same document type can have multiple HNSW indexes (one per tensor field):

field image_sift_encoding type tensor<float>(x[128]) {
  indexing: summary | attribute | index
  attribute {
    distance-metric: euclidean 
  }
  index {
    hnsw {
      max-links-per-node: 16
      neighbors-to-explore-at-insert: 500
    }
  }
}

field title_bert_embedding type tensor<float>(x[768]) {
  indexing: summary | attribute | index
  attribute {
    distance-metric: angular
  }
  index {
    hnsw {
      max-links-per-node: 24
      neighbors-to-explore-at-insert: 500
    }
  }
}

In the example above we enable fast approximate search by building an HNSW index over the image_sift_encoding and the title_bert_embedding tensor fields. We use different distance-metric and max-links-per-node settings for the two tensor fields. See HNSW index reference for details on the index parameters. Choosing the value of these parameters affects both accuracy, query speed and also indexing performance.

The tensors used in queries must have their type declared in a query profile in the application package: See query feature types.

Using fast approximate nearest neighbor search in query

With an HNSW index enabled on the tensor field we can chose if we want to use approximate or exact matching by using the approximate annotation:

{
  "yql": "select title,price from sources * where ([{\"targetHits\": 10, \"approximate\":false}]nearestNeighbor(image_sift_encoding,query_vector_sift)) and in_stock = true;",
  "hits": 100
  "ranking.features.query(query_vector_sift): [0.21,0.12,....],
  "ranking.profile": "image_similarity" 
}

By default, approximate is true when searching a tensor with a HNSW index. The approximate parameter allows us to compute the accuracy loss by having the ability to compute the exact neighbors and compare with the approximation to compute the recall@k. In addition to targetHits, there is a hnsw.exploreAdditionalHits parameter which controls how many extra nodes in the graph (in addition to targetHits) that should be explored before selecting the best closest k hits. Using a greater number gives better quality (accuracy as compared with the brute force) but worse performance.

{
  "yql": "select title,price from sources * where ([{\"targetHits\": 10, \"hnsw.exploreAdditionalHits\":100}]nearestNeighbor(image_sift_encoding,query_vector_sift)) and in_stock = true;
  "hits": 100
  "ranking.features.query(query_vector_sift): [0.21,0.12,....],
  "ranking.profile": "image_similarity" 
}

Nearest Neighbor Search Considerations

  • When using the nearestNeighbor operator the total hit count is not accurate. This is similar to the behavior with weakAnd and wand. This is true for both approximate and brute force.
  • When using the nearestNeighbor operator the grouping count is not accurate.
  • Changing the distance-metric for a tensor field with an HNSW index requires re-starting. Similar, changing the max-links-per-node and neighbors-to-explore-at-insert requires re-starting.
  • Vespa tensor fields are in-memory data structures and so is the HNSW index. For large datasets the primary memory footprint driver is the raw tensor field.