Container Tuning

A collection of configuration parameters to tune the Container as used in Vespa. Some configuration parameters have native services.xml support while others are configured through generic config overrides.

Container worker threads

Worker threads, for example those serving search queries, are managed by a central Executor provider that is configured through the config named threadpool. The maximum number of threads is controlled by the integer parameter maxthreads, which defaults to 500. Note that features such as federation spawn extra worker threads in addition to this number. The Executor provider pre-starts the worker threads, so even an idle container reports several hundred running threads. Raising maxthreads increases the risk of high GC pressure, because higher concurrency means more in-flight requests. GC pressure is a function of the number of in-flight requests, the time each request takes to complete, and the amount of garbage produced per request.

<container id="container" version="1.0">
  ...
  <config name="container.handler.threadpool">
    <maxthreads>200</maxthreads>
  </config>
  ...
</container>
Handlers that are injected with an Executor in their constructor share threads from the same thread pool. The number of active threads is available as a metric. If all threads are occupied, requests are rejected with a 503 response.
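
The relationship between in-flight requests, request latency, and garbage per request can be made concrete with a rough back-of-the-envelope estimate. The sketch below applies Little's law (average in-flight requests = arrival rate × average latency). The class, its methods, and the traffic figures are hypothetical and not part of Vespa; treat it as an illustration of the reasoning rather than a sizing tool.

```java
// Hypothetical sizing helper; none of these names are part of Vespa.
final class ThreadpoolEstimate {

    /** Little's law: average in-flight requests = arrival rate * average latency. */
    static double inFlightRequests(double requestsPerSecond, double averageLatencyMillis) {
        return requestsPerSecond * averageLatencyMillis / 1000.0;
    }

    /** Steady-state garbage produced per second = arrival rate * garbage per request. */
    static double garbageBytesPerSecond(double requestsPerSecond, double garbageBytesPerRequest) {
        return requestsPerSecond * garbageBytesPerRequest;
    }

    /** True if the estimated concurrency fits within the configured maxthreads. */
    static boolean fitsInPool(double inFlight, int maxThreads) {
        return inFlight <= maxThreads;
    }

    public static void main(String[] args) {
        double requestsPerSecond = 2000;              // assumed load
        double latencyMillis = 50;                    // assumed average latency
        double garbageBytesPerRequest = 1024 * 1024;  // assumed 1 MiB of garbage per request

        double inFlight = inFlightRequests(requestsPerSecond, latencyMillis);
        double garbageMiB = garbageBytesPerSecond(requestsPerSecond, garbageBytesPerRequest) / (1024 * 1024);

        System.out.printf("estimated in-flight requests: %.0f%n", inFlight);
        System.out.printf("estimated garbage per second: %.0f MiB%n", garbageMiB);
        System.out.println("fits in maxthreads=200: " + fitsInPool(inFlight, 200));
    }
}
```

With these assumed figures the estimate is about 100 in-flight requests, which fits within the 200 threads configured in the example above. Doubling the latency doubles the number of in-flight requests at the same load, which is why slow requests and a large thread pool together increase GC pressure.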

JVM heap size

Change the default JVM heap size settings used by Vespa to better suit the specific hardware settings or application requirements.

Setting the total JVM heap size as a percentage of available memory means the exact heap size is not known in advance, but the configuration adapts to the host and ensures that the container can start even in environments with less available memory. The example below allocates 50% of the machine's available memory to the JVM heap:

<container id="container" version="1.0">
  ...
  <nodes>
    <jvm allocated-memory="50%" />
    <node hostalias="node0" />
  </nodes>
  ...

</container>
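
To illustrate why a relative setting adapts to the host, the sketch below computes the heap size that a given percentage yields on nodes of different sizes, assuming the percentage is applied directly to the memory available on the node. The helper class and its range check are hypothetical and not part of Vespa. Runtime.getRuntime().maxMemory() is the standard Java call for reading the maximum heap of a running JVM.

```java
// Hypothetical helper; the names here are illustrative and not part of Vespa.
final class HeapShare {

    /** Heap size obtained by giving the JVM a percentage of available memory. */
    static long heapBytes(long availableMemoryBytes, int percent) {
        // Range check chosen for this sketch only.
        if (percent <= 0 || percent > 100) {
            throw new IllegalArgumentException("percent must be in (0, 100]: " + percent);
        }
        return availableMemoryBytes * percent / 100;
    }

    public static void main(String[] args) {
        long gib = 1024L * 1024 * 1024;

        // The same 50% setting yields different heap sizes on differently sized nodes.
        System.out.println("16 GiB node: " + heapBytes(16 * gib, 50) / gib + " GiB heap");
        System.out.println(" 8 GiB node: " + heapBytes(8 * gib, 50) / gib + " GiB heap");

        // Maximum heap of the JVM running this program, as reported by the JVM itself.
        long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("this JVM: " + maxHeapMiB + " MiB max heap");
    }
}
```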

JVM Tuning

Use gc-options to control GC-related parameters and options to tune other JVM parameters. See the reference documentation. Example: running with the CMS concurrent garbage collector, NewRatio=1 (old and new generation of equal size), and verbose GC logging enabled (the output is written to stdout and ends up in the vespa.log file). Note that CMS was removed in JDK 14, so this example only applies to older JVMs.

<container id="default" version="1.0">
  <nodes>
    <jvm gc-options="-XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=15 -XX:NewRatio=1 -XX:+PrintGC" options="-XX:+PrintCommandLineFlags" />
    <node hostalias="node0" />
  </nodes>
</container>

Config Server and Config Proxy

The config server and config proxy are not started based on the model in services.xml. On the contrary, they are used to bootstrap the services in that model. Consequently, their JVM parameters must be set through configuration variables. Both must be restarted after a change (for the config proxy, this means restarting services), but there is no need to run vespa-deploy prepare or vespa-deploy activate first. Example:

cloudconfig_server.jvmargs      -verbose:gc
services.jvmargs_configproxy    -verbose:gc -Xmx256m
Refer to Setting Vespa variables.

Container warmup

Some applications observe that the first queries sent to a freshly started container take a very long time to complete. This is typically caused by components that lazily set up data structures or connections. Prefer eager initialization in the component constructor over lazy initialization, although this is not always possible.

When lazy initialization cannot be avoided, run warmup queries at startup so that the first external queries are not affected. Do this by issuing queries from the constructor of the Handler that serves regular queries. If you are using the default handler, com.yahoo.search.handler.SearchHandler, subclass it and configure your subclass as the handler of query requests in services.xml.

Add a call to a warmupQueries() method as the last line of your handler constructor. The method can look something like this:

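private void warmupQueries() {
    // Placeholder URIs; replace with queries representative of real traffic.
    String[] requestUris = {"warmupRequestUri1", "warmupRequestUri2"};
    // Repeat each query enough times to trigger any lazily initialized setup.
    int warmupIterations = 50;

    for (int i = 0; i < warmupIterations; i++) {
        for (String requestUri : requestUris) {
            // Run the query through this handler, as for an external request.
            handle(HttpRequest.createTestRequest(requestUri, com.yahoo.jdisc.http.HttpRequest.Method.GET));
        }
    }
}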
Since these queries will be executed before the container starts accepting external queries, they will cause the first external queries to observe a warmed up container instance.
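
Where eager initialization is possible, it removes the need for warmup of that particular component. The sketch below shows the general pattern: all expensive setup happens in the constructor, so request-time calls only read data that is already built. The class, its lookup table, and the key format are hypothetical and stand in for whatever a real component would load.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical component; the class and its data are illustrative only.
final class EagerLookupComponent {

    private final Map<String, Integer> table;

    /** Builds the lookup table up front, so no request pays the setup cost. */
    EagerLookupComponent() {
        table = buildTable();
    }

    /** Stand-in for expensive setup such as loading a dictionary or opening connections. */
    private static Map<String, Integer> buildTable() {
        Map<String, Integer> result = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            result.put("key" + i, i);
        }
        return result;
    }

    /** Request-time lookup; only reads the already-built table. */
    Integer lookup(String key) {
        return table.get(key);
    }

    public static void main(String[] args) {
        EagerLookupComponent component = new EagerLookupComponent();
        System.out.println(component.lookup("key42"));
    }
}
```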