A collection of configuration parameters to tune the Container as used in Vespa.
The worker threads for e.g. search queries are handled by a central Executor provider, which is set up with the configuration named threadpool. The maximum number of threads is controlled by an integer parameter named maxthreads, which defaults to 500. Note that e.g. federation will spawn extra worker threads in addition to this number. The Executor provider pre-starts the worker threads, so even an idle container will report several hundred running threads. Tuning this value is especially relevant if the log starts filling up with warnings about no available worker threads on the docproc, search or processing container. Example:
<container>
    ...
    <config name="container.handler.threadpool">
        <maxthreads>2000</maxthreads>
    </config>
    ...
</container>
Handlers which are injected with an Executor in their constructor will share threads from the same threadpool.
Document processing throughput is normally a function of the code and the number of threads used. If the process uses too much memory, too many threads in use can be the root cause. Write code that is efficient and, in particular, does not wait; if waiting is needed, use Progress.LATER. The number of threads defaults to the number of CPU cores. To override, set numthreads:
<container>
    <config name="config.docproc.docproc">
        <numthreads>96</numthreads>
    </config>
    <document-processing>
        <cluster name="default">
            <nodes>
                <node hostalias="node1"/>
            </nodes>
        </cluster>
    </document-processing>
</container>
Change the default JVM heap size settings used by Vespa to better suit the specific hardware or application requirements. Vespa provides two short-hand config parameters for tuning the total JVM heap size: an absolute size, and a percentage of available memory on the machine.
Setting the absolute size of the total JVM heap in MB gives precise control of the size. The disadvantage is that the configuration then depends on the available memory in the running environment not changing significantly. The example below allocates a 2GB total heap:
<container>
    ...
    <config name="search.config.qr-start">
        <jvm>
            <heapsize>2048</heapsize>
        </jvm>
    </config>
    ...
</container>
When the total JVM heap is set as a percentage of available memory, one does not know exactly what the heap size will be, but the configuration is adaptable and ensures that the container can start even in environments with less available memory. The example below allocates 60% of available memory on the machine to the JVM heap:
<container>
    ...
    <config name="search.config.qr-start">
        <jvm>
            <heapSizeAsPercentageOfPhysicalMemory>60</heapSizeAsPercentageOfPhysicalMemory>
        </jvm>
    </config>
    ...
</container>
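The threadpool and heap settings are separate config elements, so they can sit side by side in the same container element. The sketch below simply combines two of the examples from this page; the values are copied from those examples and are not recommendations:

```xml
<container>
    <!-- Worker thread limit, from the threadpool example -->
    <config name="container.handler.threadpool">
        <maxthreads>2000</maxthreads>
    </config>
    <!-- Heap as a share of machine memory, from the heap example -->
    <config name="search.config.qr-start">
        <jvm>
            <heapSizeAsPercentageOfPhysicalMemory>60</heapSizeAsPercentageOfPhysicalMemory>
        </jvm>
    </config>
</container>
```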
Some of the Vespa services are implemented in Java. Setting the correct settings for memory usage etc. is non-trivial, and the Vespa defaults are most often good for production - not a development machine. This is not an extensive guide on how to tune the JVM, but rather a list of pointers of where to look and change.
If things work when running Vespa, skip reading here. However, if parts of Vespa fail at startup or while running, and it might be memory related, look below for hints. Example vespa.log entries:
1222175132.631524 mynode.mydomain.com 12213 config-sentinel config-sentinel.service event stopped/1 name="qrserver" pid=15414 exitcode=32768
1222175132.631583 mynode.mydomain.com 12213 config-sentinel config-sentinel.service error qrserver: Attempted to start, but fork() failed: Cannot allocate memory
...
1222175132.52427 mynode.mydomain.com 15381 logserver stdout info Java HotSpot(TM) 64-Bit Server VM warning: Attempt to deallocate stack guard pages failed.
1222175132.52480 mynode.mydomain.com 15381 logserver stdout info Java HotSpot(TM) 64-Bit Server VM warning: Attempt to allocate stack guard pages failed.
1222175132.036 mynode.mydomain.com -/10 - ADM.com.yahoo.logserver.handlers.HandlerThread config logserver.queue.size=200
1222175132.52521 mynode.mydomain.com 15381 logserver stderr warning Exception in thread "main" java.lang.OutOfMemoryError: unable to create new native thread
...
1222175130.810528 mynode.mydomain.com 14314 logserver stderr warning dl failure on line 685 Error: failed $VESPA_HOME/libexec64/jdk1.6.0/jre/lib/amd64/server/libjvm.so, because $VESPA_HOME/libexec64/jdk1.6.0/jre/lib/amd64/server/libjvm.so: failed to map segment from shared object: Cannot allocate memory
A frequent case is trying to run a sample Vespa application, configured with all services, on a development machine. This consumes gigabytes of memory and might fail.
Set JVM parameters for Java services in services.xml. After modifying them, run vespa-deploy prepare and vespa-deploy activate, then restart the relevant services. Example: Run two containers with concurrent garbage collection:
<container id="default" version="1.0">
    <nodes jvmargs="-XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+CMSIncrementalPacing">
        <node hostalias="node0" />
        <node hostalias="node1" />
    </nodes>
</container>
Example: Run the Log Server with verbose garbage collection:
<logserver hostalias="node0" jvmargs="-verbose:gc"/>
Use the jvmargs attribute on:
The config server and proxy are not executed based on the model in services.xml. On the contrary, they are used to bootstrap the services in that model. Consequently, one must use configuration variables to set the JVM parameters for the config server and config proxy. They also need to be restarted (services in the config proxy's case) after a change, but one does not need to vespa-deploy prepare or vespa-deploy activate first. Example:
cloudconfig_server.jvmargs -verbose:gc
services.jvmargs_configproxy -verbose:gc -Xmx256m
Refer to Setting Vespa variables.
Sometimes it is necessary to turn off a Vespa default setting to make custom settings take effect, for instance when choosing another GC algorithm. First disable the Vespa setting, then enable the custom setting. Assuming a Vespa release has CMS as default setting which should be overridden:
-XX:-UseConcMarkSweepGC -XX:+UseOtherGCAlgo
I.e., use the standard JVM option syntax for turning boolean flags on and off by prefixing the option name with plus or minus. User-defined parameters are always appended after the factory settings, which ensures that straightforward overrides automatically take precedence over the factory settings.
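In services.xml, such an override could look like the sketch below. G1 (-XX:+UseG1GC) stands in here for the placeholder algorithm and is only an illustrative choice; the sketch also assumes a JVM version that recognizes both flags, and the container id and hostalias are placeholders reused from the earlier example:

```xml
<container id="default" version="1.0">
    <!-- First switch off the assumed default (CMS), then switch on the replacement (G1, illustrative) -->
    <nodes jvmargs="-XX:-UseConcMarkSweepGC -XX:+UseG1GC">
        <node hostalias="node0" />
    </nodes>
</container>
```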
By adding -XX:+PrintFlagsFinal to the JVM parameters, the JVM will dump (to the log) the final value of all flags. Be warned, though, that the JVM has many hundreds of tunable flags.
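As a sketch, the flag can be passed through the same jvmargs attribute used for the container nodes earlier on this page. The container id and hostalias are placeholders reused from that example:

```xml
<container id="default" version="1.0">
    <!-- Dump the final value of every JVM flag at startup -->
    <nodes jvmargs="-XX:+PrintFlagsFinal">
        <node hostalias="node0" />
    </nodes>
</container>
```

As with other jvmargs changes, this requires vespa-deploy prepare, vespa-deploy activate and a restart of the affected services. Since the output is large, it is probably best removed again once the flags of interest have been checked.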
A few links about JVM tuning: