Applies to: Vespa Cloud and self-managed Vespa systems.

services.xml

services.xml specifies the clusters an application should have and their capabilities. It is placed in the root of the application package.

Elements:

services [version]
  container [version] - specifies a container cluster
  content   [version] - specifies a content cluster
  admin     [version] - control plane configuration (rarely needed)
  routing   [version] - how content should be routed (rarely needed)
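
As a rough orientation, the outline above could correspond to a file along the lines of the sketch below. The cluster ids (default, music) and the document type (music) are hypothetical placeholders. The child elements inside each cluster (search, document-api, min-redundancy, documents) are covered by the container and content references rather than by this page.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<services version="1.0">

    <!-- Container cluster; the id is a placeholder -->
    <container id="default" version="1.0">
        <search/>
        <document-api/>
    </container>

    <!-- Content cluster; the id and document type are placeholders -->
    <content id="music" version="1.0">
        <min-redundancy>1</min-redundancy>
        <documents>
            <document type="music" mode="index"/>
        </documents>
    </content>

</services>
```

The admin and routing elements are left out here because, as noted above, they are rarely needed.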

<services>

Attribute  Required  Value   Default  Description
version    required  number           1.0 in this version of Vespa

Optional subelements (one or more of container or content is required):

The rest of this document describes tags that are used within multiple services tags.

<nodes>

The nodes element configures the hardware resources of a cluster, and so is used in both container and content clusters. This tag works differently on Vespa Cloud and self-managed instances:

  • Vespa Cloud: The number of nodes is specified by a count attribute, and the resources of each node by a resources child element.
  • Self-managed: nodes has a node child element for each node. A node referred to in services.xml must be defined in hosts.xml using hostalias.

It is possible to specify both to make an application package work in both environments. It is also always possible to deploy either type for development on the other: when the nodes tag has only Vespa Cloud content, it is interpreted as a single-node cluster in a self-managed environment, and vice versa.
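
A minimal sketch of a nodes element that carries both forms, assuming they can be combined in one element as the paragraph above describes. The hostaliases node1 and node2 are hypothetical, and the resource values are illustrative only.

```xml
<nodes count="2">
    <!-- Vespa Cloud form: resources for each of the two nodes -->
    <resources vcpu="4" memory="16Gb" disk="125Gb"/>
    <!-- Self-managed form: hostaliases that must be defined in hosts.xml -->
    <node hostalias="node1"/>
    <node hostalias="node2"/>
</nodes>
```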

Attribute type Default Description
count integer or range Vespa Cloud: The number of nodes of the cluster.
exclusive boolean false Optional. Vespa Cloud: If true these nodes will never be placed on shared hosts even when this would otherwise be allowed (which is only for content nodes in some environments). When nodes are allocated exclusively, the resources must match the resources of the host exactly.
groups integer or range Vespa Cloud content nodes only, optional: Integer or range. Sets the number of groups into which content nodes should be divided. Each group will have an equal share of the nodes, and one or more complete copies of the corpus and index, and each query will be routed to just one group - see grouped distribution. This allows scaling to a higher query load than is possible with just a single group.
group-size integer or range Vespa Cloud content nodes only, optional: Integer or range where either value can be skipped (replaced by an empty string) to create a one-sided limit. This can be set as an alternative to explicitly setting groups: The group sizes used will always be within these limits (inclusive), for any count.

If neither groups nor group-size is set, all nodes belong to a single group. Read more in topology.

Ranges are expressed with the syntax [lower-limit, upper-limit], where both limits are inclusive. Any value set as a range will be autoscaled.
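
To make the range syntax concrete, here is a hedged sketch of a Vespa Cloud content cluster that may be autoscaled between four and eight nodes, split into two groups. All numbers are illustrative and not a sizing recommendation.

```xml
<!-- count is a range, so the node count may be autoscaled within 4..8 -->
<!-- groups is a fixed value, so the nodes are always split into 2 groups -->
<nodes count="[4, 8]" groups="2">
    <resources vcpu="4" memory="16Gb" disk="125Gb"/>
</nodes>
```

By the description of group-size above, a one-sided limit would leave one side of the brackets empty, for example group-size="[2, ]" for a lower limit only. That literal is a reading of the attribute description and has not been checked against a deployment.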

<resources>

Under nodes on Vespa Cloud: Specifies the resources each node in the cluster should have.

The resources must match a node flavor in AWS, GCP, or Azure, depending on where you are deploying. Exception: If you use remote disk, you can specify any disk size lower than the max size.

Subelements: <gpu>

Attribute type Default Description
vcpu float or range 2 CPU (virtual threads)
memory float or range, each followed by a byte unit, such as "Gb" 8 Gb in container clusters, 16 Gb in content clusters Memory
disk float or range, each followed by a byte unit, such as "Gb" 50 Gb in container clusters, 300 Gb in content clusters Disk space. To fit core dumps and heap dumps, the disk space should be larger than 3 x memory size for content nodes and 2 x memory size for container nodes.
storage-type string (enum) any The type of storage to use. This is useful for requiring local storage when network storage provides too few I/O operations or I/O performance that is too noisy:
  • local: Node-local storage is required.
  • remote: Network storage must be used.
  • any: Either remote or local storage may be used.
disk-speed string (enum) fast The required disk speed category:
  • fast: SSD-like disk speed is required
  • slow: This is sized for spinning disk speed
  • any: Performance does not depend on disk speed (often suitable for container clusters).
architecture string (enum) any Node CPU architecture:
  • x86_64
  • arm64
  • any: Use any of the available architectures.

Ranges are expressed with the syntax [lower-limit, upper-limit], where both limits are inclusive. Any value set as a range will be autoscaled.
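
The attributes in the table above can be combined as in the following sketch. The values are illustrative assumptions: they are not guaranteed to match a node flavor in any particular zone, which matters because local storage does not fall under the remote-disk exception.

```xml
<nodes count="2">
    <!-- vcpu and memory are ranges and may be autoscaled; disk is fixed -->
    <resources vcpu="[2, 4]"
               memory="[8Gb, 16Gb]"
               disk="100Gb"
               storage-type="local"
               disk-speed="fast"
               architecture="arm64"/>
</nodes>
```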

<node>

Under nodes on self-managed systems: Specifies a node that should be a member in the cluster.

Attribute  Required  Value   Default  Description
hostalias  required  string           A host name which must be mapped to a full hostname in hosts.xml
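
As an illustration of how hostalias ties the two files together, the sketch below shows a nodes fragment from services.xml next to a matching hosts.xml. The aliases and hostnames are hypothetical placeholders.

```xml
<!-- services.xml (fragment): two nodes referenced by alias -->
<nodes>
    <node hostalias="node1"/>
    <node hostalias="node2"/>
</nodes>

<!-- hosts.xml: maps each alias to a full hostname -->
<hosts>
    <host name="vespa1.example.com">
        <alias>node1</alias>
    </host>
    <host name="vespa2.example.com">
        <alias>node2</alias>
    </host>
</hosts>
```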

<gpu>

Under resources on Vespa Cloud: Declares GPU resources to provision.

Limitations:

  • Available in AWS zones only
  • Valid for container clusters only

Attribute type Description
count integer Number of GPUs
memory integer, followed by a byte unit, such as "Gb" Amount of memory per GPU. Total amount of GPU memory available is this number multiplied by count.

Example:

<nodes count="2">
    <resources vcpu="4" memory="16Gb" disk="125Gb">
        <gpu count="1" memory="16Gb"/>
    </resources>
</nodes>

Generic configuration using <config>

Most elements in services.xml accept a sub-element named config. config elements can be included on different levels in the XML structure and the lower-level ones will override values in the higher-level ones (example below). The config element must include the attribute name, which gives the full name of the configuration option in question, including the namespace. The name can either refer to configuration definitions that are shipped with Vespa or ones that are part of the application package. For a complete example on generic configuration see the application package reference.

<container id="default" version="1.0">
    <handler id="com.yahoo.vespatest.ConfiguredHandler">
        <config name="vespatest.response">
            <response>configured string</response>
        </config>
    </handler>
</container>
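
The override rule can be sketched by placing the same config at two levels. This is a hedged variation of the example above: the cluster-level value ("cluster default") is an invented placeholder, and the sketch assumes the vespatest.response definition is available to the application.

```xml
<container id="default" version="1.0">
    <!-- Higher level: applies to the whole container cluster -->
    <config name="vespatest.response">
        <response>cluster default</response>
    </config>
    <handler id="com.yahoo.vespatest.ConfiguredHandler">
        <!-- Lower level: overrides the cluster-level value for this handler -->
        <config name="vespatest.response">
            <response>configured string</response>
        </config>
    </handler>
</container>
```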

Modular Configuration

Some features are configurable using XML files in subdirectories of the application package. The configuration found in these XML files is used as if it were inlined in services.xml. This is supported for search chains, docproc chains and routing tables.
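
As a hedged sketch of what such a file might contain, the fragment below defines one search chain in its own file. The directory path (search/chains/), the chain id, and the searcher class are assumptions for illustration and should be checked against the application package reference.

```xml
<!-- Assumed location: search/chains/mychain.xml in the application package -->
<chain id="mychain">
    <!-- Hypothetical searcher component -->
    <searcher id="com.example.MySearcher"/>
</chain>
```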