Docker containers

This document describes tuning and adaptations for running Vespa in Docker containers, both for developer use on a laptop and in production.

Mounting persistent volumes

The quick start and AWS ECS multi node guides show how to run Vespa in Docker containers. In these examples, all data is stored inside the container, so it is lost if the container is deleted. When running Vespa inside Docker containers in production, add volume mappings to the parent host to persist data and logs. The directories to map are:

  • /opt/vespa/var
  • /opt/vespa/logs
$ mkdir -p /tmp/vespa/var;  export VESPA_VAR_STORAGE=/tmp/vespa/var
$ mkdir -p /tmp/vespa/logs; export VESPA_LOG_STORAGE=/tmp/vespa/logs
$ docker run --detach --name vespa --hostname vespa-container --volume $VESPA_VAR_STORAGE:/opt/vespa/var \
  --volume $VESPA_LOG_STORAGE:/opt/vespa/logs --publish 8080:8080 vespaengine/vespa
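
To confirm that both directories are actually backed by host storage, the mount list of the running container can be inspected. The following is a minimal sketch, not part of Vespa; check_vespa_mounts is a hypothetical helper name, and the container name vespa is taken from the example above.

```shell
# Reads "source -> destination" lines on stdin and reports whether both
# Vespa directories are backed by a mount. Returns 1 if either is missing.
# check_vespa_mounts is a hypothetical helper, not a Vespa or Docker command.
check_vespa_mounts() {
  mounts=$(cat)
  status=0
  for dir in /opt/vespa/var /opt/vespa/logs; do
    if printf '%s\n' "$mounts" | grep -q -- "-> $dir\$"; then
      echo "ok: $dir is mounted"
    else
      echo "missing: $dir is not mounted"
      status=1
    fi
  done
  return $status
}

# Usage (assumes a running container named "vespa"):
# docker inspect --format '{{range .Mounts}}{{.Source}} -> {{.Destination}}{{"\n"}}{{end}}' vespa \
#   | check_vespa_mounts
```

A non-zero return value indicates that data or logs would be lost when the container is deleted.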

System limits

When Vespa starts inside Docker containers, the startup scripts will set system limits. Make sure that the environment starting the Docker engine is set up in such a way that these limits can be set inside the containers.

For a CentOS/RHEL base host, Docker is usually started by systemd. In this case, LimitNOFILE, LimitNPROC and LimitCORE should be set to meet the minimum requirements in system limits.

In general, when using Docker or Podman to run Vespa, use the --ulimit option to set limits according to system limits. Set --pids-limit to unlimited (-1 for Docker and 0 for Podman).
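
As an illustration, the flags could be assembled as in the sketch below. The numeric limits are placeholders chosen for the example and are assumptions; take the actual values from the system limits documentation. limit_flags is a hypothetical helper name.

```shell
# Placeholder limits (assumptions for illustration only): replace with the
# values required by the Vespa system limits documentation.
NOFILE_LIMIT=262144
NPROC_LIMIT=409600

# Print the limit-related flags for the given engine ("docker" or "podman").
# The --pids-limit value for "unlimited" differs between the two engines.
limit_flags() {
  case "$1" in
    docker) pids=-1 ;;
    podman) pids=0 ;;
    *) echo "unknown engine: $1" >&2; return 1 ;;
  esac
  echo "--ulimit nofile=$NOFILE_LIMIT:$NOFILE_LIMIT --ulimit nproc=$NPROC_LIMIT:$NPROC_LIMIT --pids-limit $pids"
}

# Usage (word splitting of the unquoted substitution is intended):
# docker run --detach --name vespa $(limit_flags docker) vespaengine/vespa
```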

Controlling which services to start

The Docker image vespaengine/vespa takes a parameter that controls which services are started inside the container.

Starting a configserver container:

$ docker run <other arguments> \
  --env VESPA_CONFIGSERVERS=<comma separated list of config servers> \
  vespaengine/vespa configserver

Starting a services container (configserver will not be started):

$ docker run <other arguments> \
  --env VESPA_CONFIGSERVERS=<comma separated list of config servers> \
  vespaengine/vespa services

Starting a container with both configserver and services:

$ docker run <other arguments> \
  --env VESPA_CONFIGSERVERS=<comma separated list of config servers> \
  vespaengine/vespa

This is required when the configserver container should also run other services, like an adminserver or logserver (see services.html).

If the VESPA_CONFIGSERVERS environment variable is not specified, it is set to the container hostname.

Use the multinode-HA sample application as a blueprint for how to set up config servers and services.
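
When several containers are started from a script, the comma-separated value for VESPA_CONFIGSERVERS can be built once and reused for every container. The sketch below assumes three config server hostnames in the style of the multinode-HA sample; join_hosts is a hypothetical helper name.

```shell
# Join hostnames into the comma-separated form used by VESPA_CONFIGSERVERS.
# Runs in a subshell so that the IFS change does not leak to the caller.
join_hosts() (
  IFS=,
  printf '%s\n' "$*"
)

# Example hostnames (assumed for illustration).
VESPA_CONFIGSERVERS=$(join_hosts node0.vespa_net node1.vespa_net node2.vespa_net)
echo "$VESPA_CONFIGSERVERS"

# Usage: pass the same value to every container, whatever it starts.
# docker run <other arguments> --env VESPA_CONFIGSERVERS="$VESPA_CONFIGSERVERS" vespaengine/vespa configserver
# docker run <other arguments> --env VESPA_CONFIGSERVERS="$VESPA_CONFIGSERVERS" vespaengine/vespa services
```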

Graceful stop

Stopping a running vespaengine/vespa container triggers a graceful shutdown, which flushes data structures and thereby saves time when the container is started again. If the container is shut down forcefully, the content nodes might need to restore their state from the transaction log, which can be time-consuming. There is no risk of data loss or data corruption, as the data is always written and synced to persistent storage.

The default time the Docker daemon waits for a shutdown might be too short when there is a large number of documents per node. The stop command below waits up to 120 seconds before forcefully terminating the running container; if the shutdown completes before the timeout, the command returns sooner.

$ docker stop name -t 120

It is also possible to configure the default Docker daemon timeout, see --shutdown-timeout.
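
One way to tell whether a stop ran into the timeout is to look at the exit code of the stopped container. The sketch below relies on the general convention that exit code 137 (128 + 9) means the process was ended by SIGKILL, which is the signal docker stop sends once the timeout expires. An out-of-memory kill produces the same code, so treat the result as a hint only. describe_exit is a hypothetical helper name.

```shell
# Interpret a container exit code after "docker stop".
# 137 = 128 + 9: the main process was terminated by SIGKILL.
describe_exit() {
  case "$1" in
    137) echo "killed by SIGKILL: shutdown was probably not graceful" ;;
    *)   echo "exited with code $1 without being killed" ;;
  esac
}

# Usage (assumes a container named "vespa"):
# docker stop -t 120 vespa
# describe_exit "$(docker inspect --format '{{.State.ExitCode}}' vespa)"
```

If the container is regularly killed, a longer timeout is likely needed.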

Memory

The sample applications and getting started guides indicate the minimum memory requirements for the Docker containers.

As a rule of thumb, a single-node Vespa application requires a minimum of 4G for the Docker container. Using docker stats can be useful to track memory usage:

$ docker stats

CONTAINER ID   NAME      CPU %     MEM USAGE / LIMIT    MEM %     NET I/O           BLOCK I/O        PIDS
589bf5801b22   node0     213.25%   697.3MiB / 3.84GiB   17.73%    14.2kB / 11.5kB   617MB / 976MB    253
e108dde84679   node1     213.52%   492.7MiB / 3.84GiB   12.53%    15.7kB / 12.7kB   74.3MB / 924MB   252
be43aacd0bbb   node2     191.22%   497.8MiB / 3.84GiB   12.66%    19.6kB / 21.6kB   64MB / 949MB     261

It is not necessarily easy to verify that Vespa has started all services successfully. Symptoms of errors due to insufficient memory vary, depending on where it fails. Example: Inspect restart logs in a container named vespa, running the quickstart with only 2G:

$ docker exec -it vespa sh -c "/opt/vespa/bin/vespa-logfmt -S config-sentinel -c sentinel.sentinel.service"

INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 2.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 6.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 14.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 30.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: will delay start by 25.173 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 62.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 126.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: will delay start by 119.515 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 254.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 510.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: will delay start by 501.026 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 1022.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: incremented restart penalty to 1800.000 seconds
INFO    : config-sentinel  sentinel.sentinel.service	container: will delay start by 1793.142 seconds

Observe that the container service restarts in a loop, with increasing pause.
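
For longer logs, the restart lines can be counted per service instead of read by eye. The sketch below assumes the log line layout shown above, where the service name is the fifth whitespace-separated field followed by a colon; count_restarts is a hypothetical helper name.

```shell
# Count "incremented restart penalty" lines per service name.
# Assumes the vespa-logfmt layout shown above: the fifth whitespace-separated
# field is the service name with a trailing colon.
count_restarts() {
  awk '/incremented restart penalty/ {
         name = $5
         sub(/:$/, "", name)
         count[name]++
       }
       END { for (n in count) print n, count[n] }'
}

# Usage (assumes a container named "vespa"):
# docker exec vespa sh -c "/opt/vespa/bin/vespa-logfmt -S config-sentinel -c sentinel.sentinel.service" \
#   | count_restarts
```

A service with a steadily growing count is a candidate for an insufficient-memory problem.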

A common problem is Config Servers not starting or running properly due to lack of memory. This manifests itself as nothing listening on port 19071, or as deployment failures.
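
A simple way to detect this is to poll the config server port until it answers, and give up after a fixed number of attempts. The sketch below uses a generic retry helper (a hypothetical name) together with curl; it assumes that port 19071 is published on localhost and that the config server answers on the /state/v1/health path.

```shell
# Run a command until it succeeds, at most <attempts> times, sleeping
# <delay> seconds after each failure. Returns 1 if it never succeeds.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Usage (the URL is an assumption, see above):
# retry 30 5 curl --silent --fail http://localhost:19071/state/v1/health \
#   || echo "config server is not responding on 19071, check container memory"
```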

Some guides / sample applications have specific configuration to minimize resource usage. Example from multinode-HA:

$ docker run --detach --name node0 --hostname node0.vespa_net \
    -e VESPA_CONFIGSERVERS=node0.vespa_net,node1.vespa_net,node2.vespa_net \
    -e VESPA_CONFIGSERVER_JVMARGS="-Xms32M -Xmx128M"  \
    -e VESPA_CONFIGPROXY_JVMARGS="-Xms32M -Xmx32M" \
    --network vespa_net \
    --publish 19071:19071 --publish 19100:19100 --publish 19050:19050 --publish 20092:19092 \
    vespaengine/vespa

Here VESPA_CONFIGSERVER_JVMARGS and VESPA_CONFIGPROXY_JVMARGS are tweaked to the minimum for a functional test only.

Container memory settings are configured in services.xml. Example from multinode-HA:

<container id="query" version="1.0">
    <nodes>
        <jvm options="-Xms32M -Xmx128M"/>
        <node hostalias="node6" />
        <node hostalias="node7" />

Make sure that these memory settings match the memory available to the Docker container Vespa is running in.
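
A quick sanity check is to compare the configured maximum heap with the container memory limit. The sketch below extracts the -Xmx value from a JVM options string and compares it with a limit in MiB. xmx_mib is a hypothetical helper, the 4096 MiB limit is an assumed example value, and the heap is only part of the memory a JVM uses, so the comparison is a lower bound, not a guarantee.

```shell
# Extract the -Xmx value from a JVM options string and print it in MiB.
# Handles the M/m and G/g suffixes only; returns 1 if no -Xmx is found.
xmx_mib() {
  value=$(printf '%s\n' "$1" | sed -n 's/.*-Xmx\([0-9][0-9]*\)\([MmGg]\).*/\1 \2/p')
  [ -n "$value" ] || return 1
  set -- $value
  case "$2" in
    [Mm]) echo "$1" ;;
    [Gg]) echo $(( $1 * 1024 )) ;;
  esac
}

heap=$(xmx_mib "-Xms32M -Xmx128M")
limit_mib=4096   # assumed container memory limit, for illustration only

if [ "$heap" -lt "$limit_mib" ]; then
  echo "heap ${heap}MiB fits within ${limit_mib}MiB"
else
  echo "heap ${heap}MiB exceeds the ${limit_mib}MiB container limit"
fi
```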