services.xml - 'content'

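content [version, id, distributor-base-port]
    documents [selection, garbage-collection, garbage-collection-interval]
        document [type, selection, mode, global]
        document-processing [cluster, chain]
    redundancy
    nodes
        node [baseport, hostalias, jvmargs, preload, distribution-key, capacity]
    group [distribution-key, name]
        distribution [partitions]
        node [baseport, hostalias, jvmargs, preload, distribution-key, capacity]
        group [distribution-key, name]
    engine
        proton
            searchable-copies
            tuning
            flush-on-shutdown
            resource-limits
                disk
                memory
        dummy
    search
        query-timeout
        visibility-delay
        coverage
            minimum
            min-wait-after-coverage-factor
            max-wait-after-coverage-factor
    dispatch DEPRECATED
        num-dispatch-groups DEPRECATED
        group DEPRECATED
            node [distribution-key] DEPRECATED
    tuning
        bucket-splitting [max-documents, max-size, minimum-bits]
        min-node-ratio-per-group
        distribution [type]
        maintenance [start, stop, high]
        merges [max-per-node, max-queue-size]
        persistence-threads [lowest-priority-to-block-others, highest-priority-to-block]
            thread [lowest-priority, count]
        visitors [thread-count, max-queue-size]
            max-concurrent [fixed, variable]
        dispatch
            max-hits-per-partition
            dispatch-policy
            min-group-coverage
            min-active-docs-coverage
        cluster-controller
            init-progress-time
            transition-time
            max-premature-crashes
            stable-state-period
            min-distributor-up-ratio
            min-storage-up-ratio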


content

The root element of a content cluster definition. A content cluster stores and/or indexes documents. services.xml may contain zero or more content elements.

Contained in services. Attributes:

  • version (required): Must be set to '1.0' in this version of Vespa.
  • id (required for multiple clusters): Name of the content cluster. If none is supplied, the cluster name will be 'content'. Cluster names must be unique within the application, so if several clusters are set up, id must be set for all but one of them. Setting id is recommended in all cases, so that the cluster has a meaningful name. This also makes it possible to add clusters later without renaming the existing one.
  • distributor-base-port (optional): If a specific port is required for access to the distributor, override it with this attribute.
Required subelements: Optional subelements:
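
A minimal sketch of a content cluster definition, combining the attributes above with the documents, redundancy and nodes elements described later in this reference. The cluster id, the document type music and the host aliases node0 and node1 are hypothetical names chosen for the example:

```xml
<content id="music" version="1.0">
    <!-- Two copies of each document, spread over the two nodes below -->
    <redundancy>2</redundancy>
    <documents>
        <!-- Hypothetical document type, indexed for search -->
        <document type="music" mode="index"/>
    </documents>
    <nodes>
        <node distribution-key="0" hostalias="node0"/>
        <node distribution-key="1" hostalias="node1"/>
    </nodes>
</content>
```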


documents

Contained in content. Defines which document types should be routed to this content cluster using the default route, and which documents should be kept in the cluster when the garbage collector runs. Read more on expiring documents. Also holds backend-specific configuration for whether documents should be searchable. Attributes:

  • selection (optional, string): A document selection string that restricts the documents routed to this cluster. Defaults to a selection expression matching everything. This selection can match document identifier specifics that are independent of document type. Restrictions that apply only to a specific document type must be set in that document type's <document> tag; using document type references in this selection produces an error during deployment. The selection given here is merged with any per-document-type selections specified in document tags, so a document must match both selections to be accepted and kept.
  • garbage-collection (optional, true / false, default false): If true, regularly verify that the documents stored in the cluster belong in the cluster, and delete those that do not. If false, garbage collection is not run.
  • garbage-collection-interval (optional, integer, default 3600): Time (in seconds) between garbage collection cycles.
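
As an illustration of the attributes above, the sketch below runs garbage collection once per day and keeps only documents newer than one day. The document type music and its timestamp field are assumptions made for the example. The type-specific selection is placed on the document element, because document type references are not allowed in the selection on documents:

```xml
<documents garbage-collection="true" garbage-collection-interval="86400">
    <!-- Hypothetical type and field: keep documents whose timestamp is less than 86400 seconds old -->
    <document type="music" mode="index" selection="music.timestamp &gt; now() - 86400"/>
</documents>
```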


document

Contained in documents. The document type to be routed to this content cluster. Attributes:

  • type (required, string): Document type name.
  • mode (required, index / store-only / streaming): The mode of storing and indexing. In this documentation, index is assumed unless streaming or store-only is explicitly mentioned. Refer to streaming search for store-only, as documents are stored the same way in both cases.

    Changing mode requires an indexing-mode-change validation override, and documents must be re-fed.

  • selection (optional, string): A document selection string that restricts the documents routed to this cluster. Defaults to a selection expression matching everything. This selection must apply to fields in this document type only. It is merged with the selections for other types and the global selection from documents to form the full expression for which documents belong to this cluster.
  • global (optional, true / false, default false): Set to true to distribute all documents of this type to all nodes. Fields in global documents can be imported into documents to implement joins - read more in parent/child. Vespa will detect when a new (or outdated) node is added to the cluster and prevent it from taking part in searches until it has received all global documents.

    Changing from false to true or vice versa requires a global-document-change validation override. First, stop services on all content nodes. Then, deploy with the validation override. Finally, start services on all content nodes.

    Note: global is only supported for mode="index".
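
A short sketch of a global document type next to a regular one. The type names artist and album are hypothetical; the point is only that global="true" is set per document type and combined with mode="index":

```xml
<documents>
    <!-- Hypothetical parent type: every node receives all artist documents -->
    <document type="artist" mode="index" global="true"/>
    <!-- Hypothetical child type: distributed as usual, may import fields from artist -->
    <document type="album" mode="index"/>
</documents>
```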


document-processing

Contained in documents. Vespa Search specific configuration for which document processing cluster and chain to use for index preprocessing. Attributes:

  • cluster (optional, string, default: container cluster on content node): Name of a document-processing container cluster that does index preprocessing. Use cluster to specify an alternative cluster, other than the default cluster on content nodes.
  • chain (optional, string, default: indexing chain): A document processing chain in the container cluster specified by cluster to use for index preprocessing. The chain must inherit the indexing chain.

Example - the container cluster enables document-processing, referred to by the content cluster:

<container id="my-indexing-cluster" version="1.0">
    <document-processing/>
</container>
<content id="music" version="1.0">
    <documents><document-processing cluster="my-indexing-cluster"/></documents>
</content>

To add document processors either before or after the indexer, declare a chain (inherit indexing) in a document-processing container cluster and add document processors. Annotate document processors with before=indexingStart or after=indexingEnd. Configure this cluster and chain as the indexing chain in the content cluster - example:

<container id="my-indexing-cluster" version="1.0">
    <document-processing>
        <chain id="my-document-processors" inherits="indexing">
            <documentprocessor id="MyDocproc"/>
            <documentprocessor id="MyOtherDocproc"/>
        </chain>
    </document-processing>
</container>
<content id="music" version="1.0">
    <documents><document-processing cluster="my-indexing-cluster" chain="my-document-processors"/></documents>
</content>


redundancy

Contained in content. Defines the total number of copies of each piece of data the cluster will maintain to avoid data loss. Example: with a redundancy of 2, the system tolerates 1 node failure before data becomes unavailable (until the system has managed to create new replicas on other online nodes).

redundancy can be changed without node restart.
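
For illustration, the redundancy of 2 used in the example above could be written as follows. The cluster id is a hypothetical name and the other child elements are left out:

```xml
<content id="music" version="1.0">
    <!-- Keep two copies of each document; one node can fail without data becoming unavailable -->
    <redundancy>2</redundancy>
    <!-- documents, nodes and other elements omitted -->
</content>
```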


nodes

Contained in content. Defines the set of content nodes in the cluster - parent of node elements.


node

Contained in nodes or group. Adds a content node to the cluster. Attributes:

  • distribution-key (required, integer): Sets the distribution key of a node. It is not recommended to change this for a given node. It is recommended (but not required) that the set of distribution keys in the cluster is contiguous and starts at 0. Example: if the biggest distribution key is 499, the distribution algorithm needs to calculate 500 random numbers to find the correct target. It is hence recommended not to leave too many gaps in the distribution key range.

    Distribution keys are used to identify nodes and groups for the distribution algorithm. If a node changes distribution key, the distribution algorithm regards it as a new node, hence buckets are redistributed. When merging clusters, one might need to change distribution keys - details on merging clusters.

    Content nodes need unique node distribution keys across the whole cluster, as the key is also used as a node identifier where group information is not specified.

  • capacity (optional, double, default 1): Capacity of this node, relative to other nodes. A node with capacity 2 will get double the data and requests of a node with capacity 1.
  • baseport (optional, integer): See baseport.
  • hostalias (optional, string): See hostalias.
  • jvmargs (optional, string): See jvmargs.
  • preload (optional, string): See preload.
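
A small sketch of a flat nodes element, assuming three hypothetical host aliases. The distribution keys are contiguous and start at 0, as recommended above, and the last node is given twice the capacity of the others:

```xml
<nodes>
    <node distribution-key="0" hostalias="node0"/>
    <node distribution-key="1" hostalias="node1"/>
    <!-- Receives double the data and requests of a capacity 1 node -->
    <node distribution-key="2" hostalias="node2" capacity="2"/>
</nodes>
```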


group

Contained in content or group - groups can be nested. Defines the hierarchical distribution structure of the cluster. Cannot be used in conjunction with the nodes element; if a non-flat structure is desired, use this element instead. Groups can contain other groups or nodes, but not both. In Open Source Vespa, when using groups, searchable-copies and redundancy become the total number across all leaf groups in the cluster. For groups in Vespa Cloud, see separate documentation. Read more on using groups. Attributes:

  • distribution-key (required, integer): Sets the distribution key of a group. Changing this for a given group is not allowed. Group distribution keys only need to be unique among groups that share the same parent group.
  • name (required, string): The name of the group, used for access from status pages and the like.

There is currently no deployment-time verification that the distribution key remains unchanged for any given node or group. Consequently, take great care when modifying the set of nodes in a content cluster. Assigning a new distribution key to an existing node is undefined behavior. In the best case, the existing data will be temporarily unavailable until the error has been corrected. In the worst case, it risks crashes or data loss.

Example with two groups, where each group has all copies of half of the data set:

<group name="top-group" distribution-key="0">
  <distribution partitions="*"/>
  <group name="bottom-1" distribution-key="0">
    <node distribution-key="0" hostalias="node1"/>
  </group>
  <group name="bottom-2" distribution-key="1">
    <node distribution-key="1" hostalias="node2"/>
  </group>
</group>

distribution (in group)

Contained in group. Defines the data distribution to subgroups of this group. distribution should not be in the lowest level group containing storage nodes, as here the ideal state algorithm is used directly. In higher level groups, distribution is mandatory. Attributes:

  • partitions (required, if there are subgroups in the group): String conforming to the partition specification

Partition specification    Description
*                          Distribute all copies over 1 of N groups
*|*                        Distribute all copies over 2 of N groups
*|*|*                      Distribute all copies over 3 of N groups

The partition specification is used to evenly distribute content copies across groups. You must write one * per group separated by pipes (e.g. *|* for two groups).


engine

Contained in content. Specify the content engine to use, and/or adjust tuning parameters for the engine. Allowed engines are proton and dummy, the latter being used for debugging purposes. If no engine is given, proton is used. Sub-elements: one of <proton> and <dummy>.


proton

Contained in engine. If specified, the content cluster will use the Proton content engine. This engine supports storage, indexed search and secondary indices. Optional sub-elements are searchable-copies, tuning, flush-on-shutdown, and resource-limits.


searchable-copies

Contained in proton. Default value is 2, or redundancy, if lower. If set to less than redundancy, only some of the stored copies are ready for searching at any time. This means that node failures cause temporary data unavailability while the alternate copies are being indexed for search. The benefit is lower memory usage, at the cost of availability during transitions. Refer to bucket move.

If updating documents or using document selection for garbage collection, consider setting fast-access on the subset of attribute fields used for this to make sure that these attributes are always kept in memory for fast access. Note that this is only useful if searchable-copies is less than redundancy.

searchable-copies can be changed without node restart.
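
A sketch of the trade-off described above, with illustrative values: three copies are stored, but only two are kept ready for searching. Both elements are children of content:

```xml
<!-- Children of <content>; values are illustrative only -->
<redundancy>3</redundancy>
<engine>
    <proton>
        <!-- Fewer searchable copies than redundancy: less memory, slower recovery of search availability -->
        <searchable-copies>2</searchable-copies>
    </proton>
</engine>
```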


flush-on-shutdown

Contained in proton. Default value is true. If set to true, search nodes flush a set of components (e.g. memory index, attributes) to disk before shutting down, so that the time it takes to flush these components plus the time it takes to replay the transaction log after restart is as low as possible. The time it takes to replay the transaction log depends on the amount of data to replay. Flushing some components before restart prunes the transaction log, which reduces the replay time significantly. Refer to Proton maintenance jobs.


resource-limits

Contained in proton. Specifies resource limits used by proton to reject write operations when a limit is reached. Use this to implement a feed block to avoid saturating content nodes. Elements:

  • disk (optional, float in [0, 1], default: writefilter.disklimit): Fraction of total space on the disk partition used before put and update operations are rejected.
  • memory (optional, float in [0, 1], default: writefilter.memorylimit): Fraction of physical memory that can be resident memory in anonymous mapping by proton before put and update operations are rejected.

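
A sketch of a feed block configuration using the two elements above. The limits 0.80 and 0.75 are arbitrary example values, not recommendations:

```xml
<engine>
    <proton>
        <resource-limits>
            <!-- Reject put and update operations when 80% of the disk partition is used -->
            <disk>0.80</disk>
            <!-- Reject put and update operations when proton uses 75% of physical memory -->
            <memory>0.75</memory>
        </resource-limits>
    </proton>
</engine>
```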

search

Contained in content, optional. Declares search configuration for this content cluster. Optional sub-elements are query-timeout, visibility-delay and coverage.


query-timeout

Contained in search. Specifies the query timeout in seconds for queries against the search interface on the content nodes. The default is 5.0, the max is 600.0. For query timeout also see the request parameter timeout.

Note: You will not be able to override the configured value using the request parameter timeout.


visibility-delay

Contained in search. Specifies the maximum time (in seconds) that may pass from when a write operation is performed until the change is visible in search results. The default value is 0 seconds. Configuring a value larger than 0 adds a results-oriented cache at the container level, where time to live (ttl) is set to the same value as the visibility-delay. Increasing this value can also be expected to increase throughput during batch feeding.

When benchmarking batch feeding for a given test set, we got the following improvements in throughput when setting visibility-delay to 4.0 seconds: +20% during initial feeding, +15% during re-feeding and +120% during removing of 1M documents. These improvements depend on how many index and attribute fields are in the search definition, the content of the documents and the visibility-delay itself. Benchmarking is required to establish the particular improvements for a given application.
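
As a sketch, a search element combining a query timeout with the 4.0 second visibility delay used in the benchmark above could look like this. The query-timeout value is an arbitrary example within the documented range:

```xml
<search>
    <!-- Example value; the default is 5.0 and the maximum is 600.0 -->
    <query-timeout>2.5</query-timeout>
    <!-- Writes may take up to 4 seconds to become visible in search results -->
    <visibility-delay>4.0</visibility-delay>
</search>
```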


coverage

Contained in search. Declares search coverage configuration for this content cluster. Optional sub-elements are minimum, min-wait-after-coverage-factor and max-wait-after-coverage-factor.


minimum

Contained in coverage. Declares the minimum search coverage required before returning the results of a query. This number is in the range [0, 1], with 0 being no coverage and 1 being full coverage.

The default is 1; unless configured otherwise a query will not return until all search nodes have responded.


min-wait-after-coverage-factor

Contained in coverage. Declares the minimum time for a query to wait for full coverage once the declared minimum has been reached. This number is a factor that is multiplied with the time remaining at the time of reaching minimum coverage.

The default is 0; unless configured otherwise a query will return as soon as the minimum coverage has been reached, and the remaining search nodes appear to be lagging.


max-wait-after-coverage-factor

Contained in coverage. Declares the maximum time for a query to wait for full coverage once the declared minimum has been reached. This number is a factor that is multiplied with the time remaining at the time of reaching minimum coverage.

The default is 1; unless configured otherwise a query is allowed to wait its full timeout for full coverage even after reaching the minimum.
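
The three coverage settings can be combined as in the sketch below; all values are illustrative. With these values, if minimum coverage is reached when 4.0 seconds of the 5.0 second timeout remain, the query waits at least 0.2 × 4.0 = 0.8 seconds and at most 0.3 × 4.0 = 1.2 seconds more for full coverage:

```xml
<search>
    <query-timeout>5.0</query-timeout>
    <coverage>
        <!-- Return once 90% coverage is reached, subject to the wait factors below -->
        <minimum>0.9</minimum>
        <!-- Lower bound on extra waiting: 0.2 x time remaining -->
        <min-wait-after-coverage-factor>0.2</min-wait-after-coverage-factor>
        <!-- Upper bound on extra waiting: 0.3 x time remaining -->
        <max-wait-after-coverage-factor>0.3</max-wait-after-coverage-factor>
    </coverage>
</search>
```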


dispatch DEPRECATED

Since Vespa-7.109.10, this element has no effect - details.

Contained in content, optional. Defines the multi-level structure of dispatchers (scatter-gather nodes) in this cluster. Adding this element creates a hierarchy of mid-level dispatchers, ordered in dispatch groups, with content/search nodes at the leaf level. This can be used in a system with a very large number (hundreds) of content/search nodes, where the fan-out from the top-level dispatchers makes the network a bottleneck.

Currently, this multi-level structure is only supported when using flat document distribution and only one level of mid-level dispatchers. Optional sub-elements are group and num-dispatch-groups.

In the following example we create 2 mid-level dispatch groups, each containing 3 content/search nodes (referenced by the distribution key of the actual nodes). Each dispatch group also consists of 3 mid-level dispatchers that will be located on the content/search node hosts. The nodes of a dispatch group will typically be located on the same physical switch in a production setup.

In this setup the top-level dispatchers will see 2 mid-level dispatch groups, and each query is passed to 1 of the 3 dispatchers in each group. Each mid-level dispatcher passes the query to all its underlying content/search nodes:

<dispatch>
    <group>
        <node distribution-key='0'/>
        <node distribution-key='1'/>
        <node distribution-key='2'/>
    </group>
    <group>
        <node distribution-key='3'/>
        <node distribution-key='4'/>
        <node distribution-key='5'/>
    </group>
</dispatch>

num-dispatch-groups (in dispatch) DEPRECATED

Contained in dispatch. Defines the number of dispatch groups to be used in the multi-level dispatch setup. This can be specified instead of explicit dispatch groups. In this case the content/search nodes of this cluster are automatically assigned to the specified number of dispatch groups (in the same order as they are specified in this cluster).

NOTE: Should NOT be used for production.

group (in dispatch) DEPRECATED

Contained in dispatch. Defines a mid-level dispatch group in a multi-level dispatch setup. Required sub-element is node.

node (in dispatch) DEPRECATED

Contained in group, required. Defines a node in a mid-level dispatch group. This is a reference to the actual content/search node that should be part of this dispatch group. A mid-level dispatcher will also be located on the host of the content/search node. Attribute:

  • distribution-key (required): Reference to the distribution key of the actual content/search node


tuning

Contained in content, optional. Optional tuning parameters are: bucket-splitting, min-node-ratio-per-group, cluster-controller, dispatch, distribution, maintenance, merges, persistence-threads and visitors.


bucket-splitting

Contained in tuning. The bucket is the fundamental unit of distribution and management in a content cluster. Buckets are split automatically, so most applications do not need to configure this. Streaming search latency is linear with bucket size. Attributes:

  • max-documents (optional, integer, default 1024): Maximum number of documents per content bucket. Buckets are split in two if they have more documents than this. Keep this value below 16K.
  • max-size (optional, integer, default 32MiB): Maximum size (in bytes) of a bucket. This is the sum of the serialized size of all documents kept in the bucket. Buckets are split in two if they have a larger size than this. Keep this value below 100MiB.
  • minimum-bits (optional, integer): Override the ideal distribution bit count configured for this cluster. Prefer to use the distribution type setting instead if the default distribution bit count does not fit the cluster. This variable is intended for testing and to work around possible distribution bit issues. Most users should not need this option.
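
For illustration, the defaults listed above could be written out explicitly as follows. max-size is given in bytes, and 32MiB is 32 × 1024 × 1024 = 33554432 bytes:

```xml
<tuning>
    <!-- Split a bucket when it exceeds 1024 documents or 32MiB of serialized documents -->
    <bucket-splitting max-documents="1024" max-size="33554432"/>
</tuning>
```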


min-node-ratio-per-group

Contained in tuning. States a lower bound requirement on the ratio of nodes within individual groups that must be online and able to accept traffic before the entire group is automatically taken out of service. Groups are automatically brought back into service when the availability of its nodes has been restored to a level equal to or above this limit.

Elastic content clusters are often configured to use multiple groups for the sake of horizontal traffic scaling and/or data availability. The content distribution system will try to ensure a configured number of replicas is always present within a group in order to maintain data redundancy. If the number of available nodes in a group drops too far, it is possible for the remaining nodes in the group to not have sufficient capacity to take over storage and serving for the replicas they now must assume responsibility for. Such situations are likely to result in increased latencies and/or feed rejections caused by resource exhaustion. Setting this tuning parameter allows the system to instead automatically take down the remaining nodes in the group, allowing feed and query traffic to fail completely over to the remaining groups.

Valid parameter is a decimal value in the range [0, 1]. Default is 0, which means that groups are never automatically taken out of service.

Example: assume a cluster has been configured with n groups of 4 nodes each and the following tuning config:

This tuning allows for 1 node in a group to be down. If 2 or more nodes go down, all nodes in the group will be marked as down, letting the n-1 remaining groups handle all the traffic.
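
The tuning config referred to in this example is not reproduced in this text. A setting consistent with the described behavior, given here as an assumption, is a ratio of 0.75: with 4 nodes per group, 3 of 4 nodes up gives 3/4 = 0.75, which meets the limit, while 2 of 4 gives 0.5, which does not:

```xml
<tuning>
    <!-- Assumed value: at least 75% of the nodes in a group must be up for the group to stay in service -->
    <min-node-ratio-per-group>0.75</min-node-ratio-per-group>
</tuning>
```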

This configuration can be changed live as the system is running and altered limits will take effect immediately.

distribution (in tuning)

Contained in tuning. Lets you tune the distribution algorithm used in the cluster. Attributes:

  • type (optional): loose | strict | legacy. Defaults to loose.

    When the number of nodes configured in a system changes over certain limits, the system will automatically trigger major redistributions of documents. This is to ensure that the number of buckets is appropriate for the number of nodes in the cluster. This enum value specifies how aggressive the system should be in triggering such distribution changes.

    The default of loose strikes a balance between rarely altering the distribution of the cluster and keeping the skew in document distribution low. It is recommended that you use the default mode unless you have empirically observed that it causes too much skew in load or document distribution.

    Note that specifying minimum-bits under bucket-splitting overrides this setting and effectively "locks" the distribution in place.
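
A minimal sketch of setting the distribution type explicitly, here to the default value:

```xml
<tuning>
    <!-- One of loose, strict or legacy; loose is the default -->
    <distribution type="loose"/>
</tuning>
```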


maintenance

Contained in tuning. Controls the running time of the bucket maintenance process. Bucket maintenance verifies bucket content for corruption. Most users should not need to tweak this. Attributes:

  • start (required): Time string in HH:MM form, e.g. 02:00. Start of the daily maintenance window.
  • stop (required): Time string in HH:MM form, e.g. 05:00. End of the daily maintenance window.
  • high (required): Week day name string, e.g. monday. Day of the week for starting the full file verification cycle (more costly than partial file verification).
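
Using the example values from the attribute descriptions above, a maintenance window could be sketched as:

```xml
<tuning>
    <!-- Daily window from 02:00 to 05:00; full file verification starts on Mondays -->
    <maintenance start="02:00" stop="05:00" high="monday"/>
</tuning>
```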


merges

Contained in tuning. Defines throttling parameters for bucket merge operations. Attributes:

  • max-per-node (optional): Maximum number of parallel active bucket merge operations.
  • max-queue-size (optional): Maximum size of the merge bucket queue, before reporting BUSY back to the distributors.
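
A sketch of merge throttling with both attributes set. The numbers are arbitrary example values, not defaults or recommendations:

```xml
<tuning>
    <!-- At most 16 parallel merges; report BUSY to distributors when 1024 merges are queued -->
    <merges max-per-node="16" max-queue-size="1024"/>
</tuning>
```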


persistence-threads

Contained in tuning. Defines the number of persistence threads per partition on each content node. A content node executes bucket operations against the persistence engine synchronously in each of these threads. By default, four threads are created that can handle operations of any priority, as well as two threads reserved for high priority operations. Optionally, add one or more thread elements. Attributes:

  • lowest-priority-to-block-others (optional): Priority indicator (e.g. VERY_HIGH). If an operation has a priority equal to or higher than this, operations with low enough priority to be blocked will not be able to start running in other persistence threads for the same partition.
  • highest-priority-to-block (optional): Priority indicator (e.g. NORMAL_1). If an operation has a priority lower than or equal to this, and there are already operations being processed that have high enough priority to block others, this operation will not be started yet, even if there is a free persistence thread.


thread

Contained in persistence-threads. Adds a number of threads to process persistence operations on each partition. Attributes:

  • lowest-priority (optional): Priority indicator (e.g. NORMAL_1)

    The lowest priority operation these threads are allowed to process. Defaults to LOWEST. Note that in this context LOWEST refers to the lowest possible priority. In the context of setting operation priority, LOWEST is the lowest user-settable priority, but the content layer itself can create lower priority operations.

    Note: You should always have at least 1 thread capable of processing operations with any priority, as the priority of internal operations is undefined from the perspective of the end-user and some of these may have a very low priority (but still be important to eventually process). Failing to do so results in operations filling up partition queues that can never be performed.

  • count (optional): The number of these threads to create.
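
A sketch that mirrors the default layout described above: two threads reserved for high priority operations and four threads that accept any priority, so that at least one thread can always process low priority internal operations. The priority names are taken from the examples in the attribute descriptions; the blocking thresholds are illustrative:

```xml
<tuning>
    <persistence-threads lowest-priority-to-block-others="VERY_HIGH"
                         highest-priority-to-block="NORMAL_1">
        <!-- Two threads that only process VERY_HIGH priority operations or higher -->
        <thread lowest-priority="VERY_HIGH" count="2"/>
        <!-- Four threads with the default lowest-priority (LOWEST), able to process any operation -->
        <thread count="4"/>
    </persistence-threads>
</tuning>
```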


visitors

Contained in tuning. Tuning parameters for visitor operations. Might contain <max-concurrent>. Attributes:

  • thread-count (optional): The maximum number of threads in which to execute visitor operations. A higher number of threads may increase performance, but may use more memory.
  • max-queue-size (optional): Maximum size of the pending visitor queue, before reporting BUSY back to the distributors.


max-concurrent

Contained in visitors. Defines how many visitors can be active concurrently on each storage node. The number allowed depends on priority - lower priority visitors should not block higher priority visitors completely. To implement this, specify a fixed and a variable number. The maximum active is calculated by adjusting the variable component using the priority, and adding the fixed component. Attributes:

  • fixed (optional, number, default 16): The fixed component of the maximum active count.
  • variable (optional, number, default 64): The variable component of the maximum active count.
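
A sketch of visitor tuning with max-concurrent set to its documented defaults. The thread-count and max-queue-size values are arbitrary example values:

```xml
<tuning>
    <!-- Example values for thread-count and max-queue-size -->
    <visitors thread-count="8" max-queue-size="1024">
        <!-- Documented defaults: fixed component 16, variable component 64 -->
        <max-concurrent fixed="16" variable="64"/>
    </visitors>
</tuning>
```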


dispatch (in tuning)

Contained in tuning. Tune the query-dispatch behavior - child elements:

  • max-hits-per-partition (optional): Declares the maximum number of hits to return from a single node. By default, a query returns the requested number of hits + offset from every search node up to the container, which in turn orders them according to the query, then discards all hits beyond the number requested. In a system with a large fan-out, this can consume a lot of bandwidth. When there are sufficiently many search nodes, assuming an even distribution of the hits, it should suffice to return only some fraction of the requested number of hits from each node. Note that changing this number will have global ordering impact. How much is determined by the total number of search nodes involved in the query and the magnitude of the hits/offset parameters.
  • dispatch-policy (optional, round-robin / adaptive, default round-robin): Configures the policy for choosing which group receives the next request. However, multi-phase requests that either require or benefit from hitting the same group in all phases are always hashed.
      • round-robin: round-robins between the groups, putting uniform load on the groups.
      • adaptive: measures latency, preferring lower latency groups, useful for heterogeneous groups.
  • min-group-coverage (optional, default 100): Coverage required in order to serve from a group - default full coverage. Relevant only for grouped distribution.
  • min-active-docs-coverage (optional, default 50): Percentage of active documents a group needs to have, compared to the average of other groups, in order to be active for serving queries. Because of measurement timing differences, it is not advisable to tune this above 99 percent. Relevant only for grouped distribution.
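
A sketch of the child elements above in one place. The coverage values are the documented defaults, adaptive is chosen to show the non-default policy, and the max-hits-per-partition value is an arbitrary example:

```xml
<tuning>
    <dispatch>
        <!-- Example value: each node returns at most 100 hits -->
        <max-hits-per-partition>100</max-hits-per-partition>
        <!-- Prefer groups with lower measured latency -->
        <dispatch-policy>adaptive</dispatch-policy>
        <min-group-coverage>100</min-group-coverage>
        <min-active-docs-coverage>50</min-active-docs-coverage>
    </dispatch>
</tuning>
```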


cluster-controller

Contained in tuning. Tuning parameters for the cluster controller managing this cluster - child elements:

  • init-progress-time (optional): If the initialization progress count has not changed for this number of seconds, the node is assumed to have deadlocked and is set down. Note that initialization may be prioritized lower now, so setting a low value here might cause false positives. If a node is set down for the wrong reason, it will be set up again once it finishes initialization.
  • transition-time (optional, defaults: storage_transition_time, distributor_transition_time): The transition time states how long (in milliseconds) a node will be in maintenance mode during what looks like a controlled restart. Keeping a node in maintenance mode during a restart allows a restart without the cluster trying to create new copies of all the data immediately. If the node has not started initializing or got back up within the transition time, the node is set down, in which case new full bucket copies will be created. Note the separate defaults for distributor and storage (i.e. search) nodes.
  • max-premature-crashes (optional, default: max_premature_crashes): The maximum number of crashes allowed before a content node is permanently set down by the cluster controller. If the node has a stable up or down state for more than the stable-state-period, the crash count is reset. However, resetting the count will not re-enable the node if it has been disabled - restart the cluster controller to reset.
  • stable-state-period (optional, default: stable_state_time_period): If a content node's state doesn't change for this many seconds, its state is considered stable, clearing the premature crash count.
  • min-distributor-up-ratio (optional, default: min_distributor_up_ratio): The minimum ratio of distributors that are required to be up for the cluster state to be up.
  • min-storage-up-ratio (optional, default: min_storage_up_ratio): The minimum ratio of content nodes that are required to be up for the cluster state to be up.
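
A sketch of a cluster-controller tuning block using some of the child elements above. All values are arbitrary examples chosen for illustration, not the actual defaults, which come from the configuration parameters named in the list:

```xml
<tuning>
    <cluster-controller>
        <!-- Milliseconds a restarting node may stay in maintenance mode before being set down -->
        <transition-time>5000</transition-time>
        <!-- Crashes allowed before the node is permanently set down -->
        <max-premature-crashes>3</max-premature-crashes>
        <!-- Seconds without state change before the crash count is cleared -->
        <stable-state-period>240</stable-state-period>
        <min-distributor-up-ratio>0.1</min-distributor-up-ratio>
        <min-storage-up-ratio>0.7</min-storage-up-ratio>
    </cluster-controller>
</tuning>
```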