This is the HTML version of the man page for the command-line tool vespa-fbench, a tool for benchmarking the capacity of a Vespa system. Description:
vespa-fbench [-n numClients] [-c cycleTime] [-l limit] [-i ignoreCount] [-s seconds] [-q queryFilePattern] [-o outputFilePattern] [-r restartLimit] [-m maxLineSize] [-p seconds] [-k] [-x] [-y] [-z] <hostname> <port>
Some options are described below; use vespa-fbench -h for a complete list:
|-n numClients||Run vespa-fbench with numClients clients in parallel. If not specified, vespa-fbench will use a default value of 10 clients.|
|-c cycleTime||Each client will make a request every cycleTime milliseconds. -1 means the cycle time is set to twice the response time.|
|-l limit||Minimum response size for successful requests.|
|-i ignoreCount||Do not log the first ignoreCount results. -1 means no logging.|
|-s seconds||Run the test for the given number of seconds. -1 means forever.|
|-q queryFilePattern||Pattern defining input query files ['query%03d.txt']. The pattern is used with sprintf to generate filenames.|
|-o outputFilePattern||Save query results to output files with the given pattern (default is to not save results).|
|-r restartLimit||Number of times to re-use each query file. -1 means no limit [-1].|
|-m maxLineSize||Maximum line size in input query files. Cannot be less than the minimum.|
|-p seconds||Print a summary every given number of seconds. Only available when installing vespa-fbench from the test branch.|
|-k||enable HTTP keep-alive.|
|-x||write benchmarkdata-reporting to output file.|
|-y||write data on coverage to output file (must be used with -x).|
|-z||use single query file to be distributed between clients.|
|-P||use POST for requests instead of GET|
vespa-fbench uses query files: files where each line contains one query.
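As an illustrative sketch, a query file can be created like this. The filename and the URL paths are hypothetical; the actual paths depend on your application's search API:

```shell
# Hypothetical query file: one request path per line.
cat > query000.txt <<'EOF'
/search/?query=sddocname:music
/search/?query=title:hello
/search/?query=artist:metallica
EOF
wc -l < query000.txt
```

Each line is sent as-is by a client, so the lines should be valid request paths for the target system.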
It is often desirable to collect queries from a live system
Adding "&nocache" to each query will force the search container to request results from the search nodes. This parameter is recommended if the query set for benchmarking is small, otherwise the benchmark results will be biased by the caching performance of the container.
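One way to add the parameter to an existing query file is with sed; the input file and queries below are hypothetical examples:

```shell
# Create a small example query file (hypothetical queries).
printf '/search/?query=foo\n/search/?query=bar\n' > queries.txt
# Append &nocache to every line (assumes each line already contains a '?').
sed 's/$/\&nocache/' queries.txt > queries-nocache.txt
cat queries-nocache.txt
```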
It is recommended to run multiple tests, varying the -n parameter, to measure how much query load the system can sustain while still meeting the QPS and/or latency requirements. For example, start with 1 client, then 10, 20, etc.
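Such a sweep can be sketched as a loop; the hostname, port, and run length below are placeholders:

```shell
# Print the commands for a hypothetical client-count sweep
# (echoed rather than executed, so the sketch runs without the tool installed).
for n in 1 10 20 40; do
  echo "vespa-fbench -n $n -c 0 -s 300 -q query%03d.txt test.domain.com 8080"
done > sweep-commands.txt
cat sweep-commands.txt
```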
A typical vespa-fbench command looks like:
$ vespa-fbench -n 10 -q query%03d.txt -s 300 -c 0 -o output%03d.txt -xy test.domain.com 8080
This creates 10 clients which will run for 300 seconds (5 minutes). The -c parameter states that each client will wait 0 milliseconds between requests. Each client uses a query file and an output file given by the respective pattern and its client number, i.e. client 1 will use query file query001.txt and output file output001.txt.
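The sprintf-style expansion of the filename pattern can be sketched with printf (three client numbers shown for illustration):

```shell
# Expand the pattern 'query%03d.txt' for a few client numbers.
for client in 0 1 2; do
  printf 'query%03d.txt\n' "$client"
done > filenames.txt
cat filenames.txt
```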
-xy makes the vespa-fbench clients write benchmarking data to their output files.
It is possible to list several hostnames and ports. The hostnames are distributed to the clients in a round-robin manner; for example, with two hosts and 40 clients, clients 0, 2, …, 38 make requests to the first host while clients 1, 3, …, 39 make requests to the second host.
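The round-robin assignment described above can be sketched as client i querying host (i mod number-of-hosts); the host names below are hypothetical:

```shell
# Sketch of round-robin client-to-host assignment (hypothetical host names).
hosts=(host-a host-b)
for client in 0 1 2 3; do
  echo "client $client -> ${hosts[client % ${#hosts[@]}]}"
done > assignment.txt
cat assignment.txt
```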
Note that saving all responses to disk might impact the performance of the benchmarking itself. If only the summary is needed it is recommended to not use output files.
After running vespa-fbench you will have a summary written to stdout and an output file from each client.
After a test run has completed, vespa-fbench outputs various test results. This section explains what each of these numbers means. Notes:
|connection reuse count||This value indicates how many times HTTP connections were reused to issue another request. Note that this number will only be displayed if the -k switch (enable HTTP keep-alive) is used.|
|clients||Echo of the -n parameter.|
|cycle time||Echo of the -c parameter.|
|lower response limit||Echo of the -l parameter.|
|skipped requests||Number of requests that were skipped by vespa-fbench. vespa-fbench will typically skip a request if the line containing the query URL exceeds a pre-defined limit. Skipped requests have minimal impact on the statistical results.|
|failed requests||The number of failed requests. A request is marked as failed if an error occurred while reading the result or if the result contained fewer bytes than 'lower response limit'.|
|successful requests||Number of successful requests. Each performed request is counted as either successful or failed. Skipped requests (see above) are not performed and therefore not counted.|
|cycles not held||Number of cycles not held. The cycle time is specified with the -c parameter. It defines how often a client should perform a new request. However, a client may not perform another request before the result from the previous request has been obtained. Whenever a client is unable to initiate a new request 'on time' due to not being finished with the previous request, this value will be increased.|
|minimum response time||The minimum response time. The response time is measured as the time period from just before the request is sent to the server, till the result is obtained from the server.|
|maximum response time||The maximum response time. The response time is measured as the time period from just before the request is sent to the server, till the result is obtained from the server.|
|average response time||The average response time. The response time is measured as the time period from just before the request is sent to the server, till the result is obtained from the server.|
|X percentile||The X percentile of the response time samples; a value selected such that X percent of the response time samples are below this value. In order to calculate percentiles, a histogram of response times is maintained for each client at runtime and merged after the test run ends. If a percentile value exceeds the upper bound of this histogram, it will be approximated (and thus less accurate) and marked with '(approx)'.|
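The percentile definition above can be illustrated with a simple nearest-rank calculation on a small set of hypothetical response-time samples (vespa-fbench itself computes percentiles from merged per-client histograms, not this way):

```shell
# Nearest-rank 95th percentile over hypothetical response times (ms):
# sort the samples, then pick the value at rank ceil(N * 0.95).
samples="12 15 11 90 14 13 16 18 17 200"
p95=$(echo "$samples" | tr ' ' '\n' | sort -n | awk '
  { a[NR] = $1 }
  END { idx = int(NR * 0.95); if (idx < NR * 0.95) idx++; print a[idx] }')
echo "95th percentile: $p95 ms"
```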
|actual query rate||The average number of queries per second; QPS.|
|utilization||The percentage of time used waiting for the server to complete (successful) requests. Note that if a request fails, the utilization will drop since the client has 'wasted' the time spent on the failed request.|
|zero hit queries||The number of queries that gave zero hits in Vespa.|
These results will be added to the output files if the -x switch (activate benchmarkdata-reporting) is used.
|NumHits||Number of hits returned|
|NumFastHits||Number of actual document hits returned|
|TotalHitCount||Total number of hits for query|
|QueryHits||Hits as specified in query|
|QueryOffset||Offset as specified in query|
|NumErrors||Number of error hits returned|
|NumGroupHits||Number of grouping hits returned|
|SearchTime||Time used for searching. Entire query time for one phase search, first phase for two-phase search|
|AttributeFetchTime||Time used for attribute fetching, or 0 for one phase search|
|FillTime||Time used for summary fetching, or 0 for one phase search|
The -y switch activates additional data on coverage, using the report coverage query feature. These results will be added to the output files if the -y switch is used.
|DocsSearched||Total number of documents in nodes searched|
|NodesSearched||Total number of search nodes which were used|
|FullCoverage||1 if true, 0 if false|