A collection of configuration parameters to tune the Container as used in Vespa.
Some configuration parameters have native services.xml support,
while others are configured through generic config overrides.
Container worker threads
The container uses multiple thread pools for its operations.
Most components including request handlers use the container's default thread pool,
which is controlled by a shared executor instance.
Any component can use the default pool by injecting a java.util.concurrent.Executor instance.
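As a minimal sketch of this pattern, the component below takes a java.util.concurrent.Executor as a constructor argument and offloads work to it. The class name AsyncGreeter and its greet method are hypothetical, and any injection annotation is left out so that the example compiles with the JDK alone. In the usage note, a locally created executor stands in for the container's default pool.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

// Hypothetical component: inside a container, the Executor argument is assumed
// to be supplied by dependency injection rather than created by the caller.
class AsyncGreeter {
    private final Executor executor;

    AsyncGreeter(Executor executor) {
        this.executor = executor;
    }

    // Runs the work on the supplied pool instead of on the calling thread.
    CompletableFuture<String> greet(String name) {
        return CompletableFuture.supplyAsync(() -> "hello " + name, executor);
    }
}
```

Outside a container the class can be exercised with any executor, for example `new AsyncGreeter(Executors.newFixedThreadPool(2)).greet("vespa").join()`.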
Some built-in components have dedicated thread pools - such as the Jetty server and the search handler.
These thread pools are injected through special wiring in the config model and
are not easily accessible from other components.
By default, the thread pools are scaled according to the system resources reported by the JVM
(Runtime.getRuntime().availableProcessors()).
It's paramount that the -XX:ActiveProcessorCount/jvm_availableProcessors
configuration is correct for the container to work optimally.
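One way to verify what the JVM actually reports is to print the value from inside the same runtime environment. The class name CpuCheck below is hypothetical; the only call it relies on is the standard Runtime.getRuntime().availableProcessors().

```java
class CpuCheck {
    // Number of processors the JVM believes it may use. This is the value
    // that -XX:ActiveProcessorCount overrides when that flag is set.
    static int visibleProcessors() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        System.out.println("availableProcessors = " + visibleProcessors());
    }
}
```

If the printed number does not match the CPU resources actually allotted to the container process, pools scaled from it will be sized for the wrong machine.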
The default thread pool configuration can be overridden through services.xml.
We recommend you keep the default configuration as it's tuned to work across a variety of workloads.
Note that the default configuration and pool usage may change between minor versions.
The container will pre-start the minimum number of worker threads,
so even an idle container may report running several hundred threads.
The thread pool is pre-started with the number of threads specified in the threads parameter.
Note that tuning the capacity upwards increases the risk of high GC pressure
as concurrency becomes higher with more in-flight requests.
The GC pressure is a function of the number of in-flight requests, the time it takes to complete each request,
and the amount of garbage produced per request.
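The relationship can be made concrete with a back-of-the-envelope model. The sketch below is a simplification and not how the container measures anything: it assumes a steady number of in-flight requests, a uniform latency, and that all garbage of a request stays reachable until the request completes. The class and method names are hypothetical.

```java
class GcPressureEstimate {
    // Requests completed per second at a steady concurrency level
    // (Little's law: throughput = concurrency / latency).
    static double requestsPerSecond(int inFlight, long latencyMillis) {
        return inFlight * 1000.0 / latencyMillis;
    }

    // Approximate allocation rate in MB/s that the collector has to keep up with.
    static double allocationMbPerSecond(int inFlight, long latencyMillis, double garbageMbPerRequest) {
        return requestsPerSecond(inFlight, latencyMillis) * garbageMbPerRequest;
    }

    // Approximate memory in MB held by requests that are still in flight,
    // under the worst-case assumption stated in the lead-in.
    static double inFlightMb(int inFlight, double garbageMbPerRequest) {
        return inFlight * garbageMbPerRequest;
    }
}
```

With 200 in-flight requests, 50 ms latency and 2 MB of garbage per request, the model gives 4000 requests per second, an allocation rate of 8000 MB/s and about 400 MB held by in-flight requests. Doubling the concurrency doubles both figures.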
Increasing the queue size allows the application to absorb short traffic bursts without rejecting requests,
at the cost of higher average latency for the requests that are queued.
Large queues will also increase heap consumption in overload situations.
Extra threads will be created once the queue is full (when boost is specified), and are destroyed after an idle timeout.
If all threads are occupied, requests are rejected with a 503 response.
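The JDK's own ThreadPoolExecutor follows the same general pattern of permanent threads, a bounded queue, extra threads once the queue is full, and rejection when everything is occupied. The sketch below uses it as an analogy only; it is not the container's implementation. The class BoundedPoolDemo and its parameter names are hypothetical, with maxThreads playing the role of the boosted ceiling. Threads are deliberately not pre-started so that task placement is deterministic in the demo.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolDemo {
    // Submits `tasks` tasks that all block until released and returns how
    // many of them the pool rejected (a container would answer those with 503).
    static int countRejections(int coreThreads, int maxThreads, int queueCapacity, int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads, maxThreads,
                5, TimeUnit.SECONDS,                      // idle timeout for the extra threads
                new ArrayBlockingQueue<>(queueCapacity)); // bounded work queue
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < tasks; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // keep the worker thread occupied
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // all threads busy and the queue is full
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
            try {
                pool.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return rejected;
    }
}
```

With 2 core threads, a ceiling of 3 threads and a queue of 2, the first two tasks occupy the core threads, the next two are queued, the fifth triggers the extra thread and the sixth is rejected, so `countRejections(2, 3, 2, 6)` returns 1.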
The effective thread pool configuration and utilization statistics can be observed through the
Container Metrics.
See Thread Pool Metrics for a list of metrics exported.
Note:
If the queue size is set to 0 the metric measuring the queue size -
jdisc.thread_pool.work_queue.size - will instead switch to measure how many threads are active.
Lower limit
The container will override any configuration if the effective value is below a fixed minimum. This is to
reduce the risk of certain deadlock scenarios and improve concurrency for low-resource environments.
Minimum 8 threads.
Minimum 650 queue capacity (if queue is not disabled).
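The rule above amounts to a simple clamp on the effective values. The sketch below restates it in code; the class and method names are hypothetical, and treating a queue size of 0 as "queue disabled" is an assumption based on the note about the queue size metric.

```java
class ThreadPoolLimits {
    static final int MIN_THREADS = 8;
    static final int MIN_QUEUE_CAPACITY = 650;

    // The effective thread count is never allowed to fall below the minimum.
    static int effectiveThreads(int configured) {
        return Math.max(configured, MIN_THREADS);
    }

    // A queue size of 0 is assumed to mean "queue disabled" and is left
    // untouched; any other value is raised to the minimum if it is lower.
    static int effectiveQueueCapacity(int configured) {
        return configured == 0 ? 0 : Math.max(configured, MIN_QUEUE_CAPACITY);
    }
}
```

For instance, an effective value of 4 threads becomes 8 and a queue of 100 becomes 650, while 200 threads or a queue of 1000 are kept as configured.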
Example
<container id="container" version="1.0">
    <search>
        <!-- Search handler thread pool -->
        <threadpool>
            <threads boost="12">4</threads>
            <queue>100</queue>
        </threadpool>
    </search>
    <!-- Default thread pool -->
    <config name="container.handler.threadpool">
        <maxthreads>200</maxthreads>
    </config>
</container>
JVM heap size
Change the default JVM heap size settings used by Vespa to better suit
the specific hardware settings or application requirements.
When the total JVM heap size is set as a
percentage of available memory,
the exact heap size is not known in advance,
but the configuration adapts to the host
and ensures that the container can start
even in environments with less available memory.
For example, a relative size of 50% allocates half of the available memory on the machine to the JVM heap.
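Because a relative setting leaves the exact heap size open, it can help to compare the expected value with what the running JVM reports. The sketch below is an illustration only: the class HeapSizeCheck and the 16 GiB host are hypothetical, and Runtime.getRuntime().maxMemory() is the standard JDK call for the maximum heap the JVM will attempt to use.

```java
class HeapSizeCheck {
    // Heap size in bytes that a relative setting is expected to resolve to.
    static long expectedHeapBytes(long availableMemoryBytes, int percent) {
        return availableMemoryBytes * percent / 100;
    }

    // Maximum heap the running JVM will actually attempt to use.
    static long actualMaxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        long sixteenGiB = 16L * 1024 * 1024 * 1024; // hypothetical host memory
        System.out.println("expected max heap: " + expectedHeapBytes(sixteenGiB, 50));
        System.out.println("actual max heap:   " + actualMaxHeapBytes());
    }
}
```

On the hypothetical 16 GiB host, a 50% setting is expected to resolve to 8 GiB. The value reported by the JVM can differ somewhat depending on the collector and other settings.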
Use gc-options for controlling GC related parameters
and options for tuning other parameters.
See reference documentation.
Example: running with a 4 GB heap using the G1 garbage collector with NewRatio = 1
(equal size of old and new generation) and verbose GC logging enabled (logged to stdout, which ends up in the vespa.log file).
With the Docker image, the default heap size is 1.5g, which can be on the low side for high-throughput applications,
causing frequent garbage collection.
By default, the G1GC collector is used.
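After changing GC options, it can be useful to confirm which collector a JVM is actually running. The sketch below lists the collector names through the standard java.lang.management API; the class name ActiveCollectors is hypothetical, and the code inspects the JVM it runs in, so it is a generic check rather than anything specific to the container.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Names of the garbage collectors registered in the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        collectorNames().forEach(System.out::println);
    }
}
```

When G1 is active, the reported names typically start with "G1".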
Config Server and Config Proxy
The config server and proxy are not executed based on the model in services.xml.
On the contrary, they are used to bootstrap the services in that model.
Consequently, one must use configuration variables
to set the JVM parameters for the config server and config proxy.
Both must be restarted after a change (for the config proxy, this means restarting services),
but there is no need to run vespa prepare
or vespa activate first.
Container warmup
Some applications observe that the first queries made to a freshly started container
take a long time to complete.
This is typically due to some components performing lazy setup of data structures or connections.
Lazy initialization should be avoided in favor of eager initialization in the component constructor,
but this is not always possible.
A way to avoid problems with the first queries in such cases
is to perform warmup queries at startup.
This is done by issuing queries from the constructor of the handler that serves regular queries.
If using the default handler,
com.yahoo.search.handler.SearchHandler,
subclass this and configure your subclass as the handler of query requests in services.xml.
Add a call to a warmupQueries() method as the last line of your handler constructor.
The method can look something like this:
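A dependency-free sketch of such a method is shown here. It is not the handler API itself: the `handle` argument is a stand-in for the subclass's own request handling, the request URIs are hypothetical, and appending `metrics.ignore=true` as a query parameter is an assumption about how the setting is passed.

```java
import java.util.List;
import java.util.function.Consumer;

class WarmupSketch {
    // Issues every warmup request `iterations` times through `handle` and
    // returns the number of requests attempted.
    static int warmupQueries(Consumer<String> handle, List<String> requestUris, int iterations) {
        int issued = 0;
        for (int i = 0; i < iterations; i++) {
            for (String uri : requestUris) {
                // Assumed form of the parameter that keeps warmup out of metrics.
                String separator = uri.contains("?") ? "&" : "?";
                try {
                    handle.accept(uri + separator + "metrics.ignore=true");
                } catch (RuntimeException e) {
                    // A failed warmup query should not prevent the container from starting.
                }
                issued++;
            }
        }
        return issued;
    }
}
```

In a handler subclass, the call would be the last statement of the constructor, with `handle` delegating to the handler's own request processing. A few dozen iterations over a handful of representative queries is a reasonable starting point, to be adjusted by measurement.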
Since these queries will be executed before the container starts accepting external queries,
they will cause the first external queries to observe a warmed up container instance.
Use metrics.ignore
in the warmup queries to exclude them from the reported metrics.