Vespa Search Performance Tuning

This document describes how to tune an application for high performance. Scaling an application is covered in the search sizing guide.

Attribute vs. index

The attribute documentation summarizes when to use attribute in the indexing statement. Adding attribute:fast-search speeds up searches over attribute fields by building an in-memory index over the values in the attribute field.

field timestamp type long {
    indexing:  summary | attribute
    attribute: fast-search
    rank:      filter
}

If both index and attribute are configured for a string field, Vespa searches and matches against the index, using the default match mode text.

Indexing strings

When configuring string type fields with index, the default match mode is text. This means Vespa will tokenize the content and index the tokens. Example document definition:

search foo {
    document foo {
        field title type string {
            indexing: summary | index
        }
        field uuid type string {
            indexing: summary | index
        }
    }
}

The string representation of a universally unique identifier (UUID) is 32 hexadecimal (base 16) digits, displayed in five groups separated by hyphens, in the form 8-4-4-4-12, for a total of 36 characters (32 hexadecimal digits and four hyphens).

Example: when 123e4567-e89b-12d3-a456-426655440000 is indexed with the above document definition, Vespa tokenizes it into 5 tokens: [123e4567, e89b, 12d3, a456, 426655440000].

Phrase search is evaluated over positional indices and costs more than searching for a single-word term. Vespa creates implicit phrases when terms are joined by hyphens. Hence /search/?query=uuid:123e4567-e89b-12d3-a456-426655440000 becomes a phrase query: uuid:"123e4567 e89b 12d3 a456 426655440000".

Change the mode to match: word to disable tokenization. This stores the input 123e4567-e89b-12d3-a456-426655440000 as one token and avoids implicit phrase search:

field uuid type string {
    indexing: summary | index
    match:    word
    rank:     filter
}

Also configure uuid as a rank: filter field. The field is then represented as efficiently as possible during search and ranking.

Summary: Review the string fields in the application and decide, for each field:

  • whether it needs tokenized matching or not
  • whether it is used in ranking or not

The rank:filter behavior can also be enabled at query time, on a per query item basis, in a custom searcher.

Parent child and search performance

Searching imported attribute fields from parent document types carries an additional cost. This cost can be reduced significantly if the imported field is defined with rank:filter and visibility-delay is set to a value greater than 0.
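
As a rough sketch of such a setup, the two schemas below use hypothetical document types (campaign as the parent, ad as the child) and hypothetical field names. The sketch assumes that rank: filter is set on the parent field that the child imports. The visibility-delay setting belongs to the content cluster configuration rather than the schema and is not shown here.

```
# Parent document type (hypothetical)
search campaign {
    document campaign {
        field budget type int {
            indexing: attribute
            rank:     filter
        }
    }
}

# Child document type (hypothetical) that references the parent
search ad {
    document ad {
        field campaign_ref type reference<campaign> {
            indexing: attribute
        }
    }
    # Makes the parent's budget attribute searchable from ad documents
    import field campaign_ref.budget as campaign_budget {}
}
```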


Query cost scales with the number of hits the query recalls, since every recalled hit must be ranked on the node that holds it. The ranking cost per recalled document is determined by the complexity of the ranking expression in use and of the rank features it refers to.

Document summaries

If queries request many hits from a few content nodes, a summary cache might reduce cost.

Document summaries can be memory-only operations if all fields are attributes. Use a summary class to request attribute fields only.
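
A minimal sketch of such a summary class, written in the same schema syntax as the examples above. The class name attributes_only is hypothetical, and the timestamp field is assumed to be an attribute as in the earlier example; title is an index field and is deliberately left out of the class.

```
search foo {
    document foo {
        field title type string {
            indexing: summary | index
        }
        field timestamp type long {
            indexing: summary | attribute
        }
    }
    # Hypothetical summary class that contains attribute fields only
    document-summary attributes_only {
        summary timestamp type long {
            source: timestamp
        }
    }
}
```

A query would then select the class with the summary parameter, for example /search/?query=title:foo&summary=attributes_only.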

Boolean, numeric, text attribute

When selecting an attribute field type with performance in mind, use these rules of thumb:

  1. Use boolean if a field is a boolean (max two values)
  2. Use a string attribute if there is a set of values - only unique strings are stored
  3. Use a numeric attribute for range searches

Refer to attribute memory usage for details.
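
The rules of thumb above could translate into field definitions like the following. This is only an illustration: the field names in_stock, category and price are hypothetical, and the fields are meant to be placed inside a document block.

```
# Rule 1: a true/false value
field in_stock type bool {
    indexing: summary | attribute
}

# Rule 2: a limited set of distinct string values
field category type string {
    indexing: summary | attribute
}

# Rule 3: a numeric value that is searched by range
field price type int {
    indexing: summary | attribute
}
```

With these definitions, a range search over the numeric attribute could look like /search/?query=price:[10;100].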