Attributes

Attributes are in-memory structures for fields - use attribute when the field is used in operations such as grouping, sorting and ranking.

Or, the other way around, use index for fields used for text search, with stemming and normalization. Find more details in the Blog search tutorial. The attribute keyword in a search definition specifies the indexing for a document field - see the indexing language.

Attributes speed up query execution and document updates, trading off memory. As data structures are regularly optimized, consider both static and temporal resource usage - refer to the attribute sizing guide.

Use attributes in document summaries to limit storage accesses when generating result sets.

Using attribute for a field means query matching works on memory structures only. Note that a basic attribute is a linear array-like data structure - matching documents means scanning the attribute value of every document. Setting fast-search generates an index structure for quicker lookup, using more memory. As a rule of thumb, set fast-search on attributes used in queries, with some caveats.
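The difference between scanning a plain attribute and looking up a value in an extra index structure can be illustrated with a small conceptual sketch. This is not how Vespa implements attributes or fast-search; the values, function names and the dictionary layout are hypothetical and only show why an index trades memory for lookup speed.

```python
from collections import defaultdict

# Hypothetical attribute values, one slot per local document id.
values = [10, 42, 7, 42, 13]


def scan_match(values, target):
    """Linear scan: inspect the value of every document, as with a plain attribute."""
    return [docid for docid, value in enumerate(values) if value == target]


def build_index(values):
    """Build an extra value -> document ids mapping (similar in spirit to fast-search)."""
    index = defaultdict(list)
    for docid, value in enumerate(values):
        index[value].append(docid)
    return index


def index_match(index, target):
    """Dictionary lookup: no scan needed, at the cost of the memory held by the index."""
    return index.get(target, [])


index = build_index(values)
print(scan_match(values, 42))   # scans all five values
print(index_match(index, 42))   # one dictionary lookup
```

Both functions return the same document ids; the scan cost grows with the number of documents, while the lookup cost does not, which is the trade-off described above.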

Attributes and updates

A partial update can update memory structures for high throughput. The Vespa search core maintains separate data structures for active and inactive document replicas. Make sure all attribute replicas are in memory by using fast-access, trading off memory usage. Conversely, one can save memory by not setting fast-access, if the partial update rate is low and update latency is not important.

Attributes and sizing

Attributes are always stored in memory, except when streaming search is enabled. In streaming search, attribute data is read into memory during query evaluation. Note that the document meta store is always stored in memory, regardless of indexing mode.

When resizing attributes, both the current and the new attribute will temporarily reside in memory. Hence, increasing the size of an attribute can temporarily more than double the memory used - refer to the resizing reference.

Read the attribute sizing guide.

Document meta store

The document meta store is an in-memory data structure used for bookkeeping about every document stored on a node.

The document meta store is an implicit attribute, and is compacted and flushed. Memory usage for applications with small documents can be dominated by this attribute, particularly for store-only and streaming search applications.

The document meta store scales linearly with number of documents - using approximately 30 bytes per document on disk. Hence, to estimate disk usage, feed X% of corpus and extrapolate.

$ du -sh $VESPA_HOME/var/db/vespa/search/cluster.mystream/n1/documents/doctype/0.ready/*
  4.0K	attribute
  216M	documentmetastore
  4.0K	index
  1.5G	summary
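The extrapolation can be sketched in a few lines. The 216M figure is the documentmetastore size from the du output above; the 10% sample fraction is an assumption made for illustration only, as the text does not say how much of the corpus was fed.

```python
MIB = 1024 ** 2


def extrapolate_bytes(sample_bytes, sample_fraction):
    """Scale the disk usage measured for a sample up to the full corpus.

    Assumes usage grows linearly with the number of documents.
    """
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    return sample_bytes / sample_fraction


# documentmetastore size reported by du -sh above.
sample_bytes = 216 * MIB

# Assumption for illustration: the sample was 10% of the corpus.
estimate = extrapolate_bytes(sample_bytes, 0.10)
print(f"Estimated full-corpus size: {estimate / MIB:.0f} MiB")
```

The same function applies to the summary and attribute directories, as long as their size also scales roughly linearly with the document count.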
The metric content.proton.documentdb.ready.attribute.memory_usage.allocated_bytes for "field": "[documentmetastore]" is the size of the document meta store in memory - use the metric API to find the size - example:
{
  "name": "content.proton.documentdb.ready.attribute.memory_usage.allocated_bytes",
  "description": "The number of allocated bytes",
  "values": {
    "average": 4.69736008E8,
    "count": 12,
    "rate": 0.2,
    "min": 469736008,
    "max": 469736008,
    "last": 469736008
  },
  "dimensions": {
    "documenttype": "doctype",
    "field": "[documentmetastore]"
  }
},
In the above example, the node has 9M ready documents with 52 bytes in memory per document.
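The per-document figure follows from dividing the metric value by the number of ready documents. A minimal sketch of that arithmetic, using a trimmed copy of the metric shown above; the helper name bytes_per_document is hypothetical, and the document count is the 9M stated in the text.

```python
import json

# Trimmed copy of the metric shown above.
metric_json = """
{
  "name": "content.proton.documentdb.ready.attribute.memory_usage.allocated_bytes",
  "values": {"last": 469736008},
  "dimensions": {"documenttype": "doctype", "field": "[documentmetastore]"}
}
"""


def bytes_per_document(metric, document_count):
    """Divide the last reported allocated bytes by the number of documents."""
    return metric["values"]["last"] / document_count


metric = json.loads(metric_json)
ready_documents = 9_000_000  # 9M ready documents, as stated in the text
per_doc = bytes_per_document(metric, ready_documents)
print(f"{per_doc:.1f} bytes per document")  # prints "52.2 bytes per document"
```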

Note: the above is for the ready (i.e. indexed) documents - also check removed and notready metrics. For more information on what these different document categories mean for a search node, please see the document sub database documentation.