
Query API Reference

Refer to the Query API guide for API examples.

All the request parameters listed below can be set in query profiles. The first four blocks of properties are also modeled as query profile types. These types can be referred from query profiles (and inheriting types) to provide type checking on the parameters.

These parameters often have both a full name - including the path from the root query profile - and one or more abbreviated names. Both names can be used in requests, while only full names can be used in query profiles. The full names are case-sensitive, abbreviated names are case-insensitive.
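The naming rules above can be sketched as a small lookup. The alias table copies a few entries from this reference; the function name `resolve_parameter` is hypothetical, and this is only an illustration of the case rules, not how Vespa itself resolves names.

```python
from typing import Optional

# A few full names and their abbreviated names, copied from this reference.
FULL_NAMES = {"hits", "offset", "model.queryString", "model.defaultIndex", "ranking.profile"}
ALIASES = {
    "count": "hits",
    "start": "offset",
    "query": "model.queryString",
    "default-index": "model.defaultIndex",
    "ranking": "ranking.profile",
}


def resolve_parameter(name: str) -> Optional[str]:
    """Return the full parameter name for a request parameter, or None if unknown."""
    if name in FULL_NAMES:
        # Full names are matched exactly (case-sensitive).
        return name
    # Abbreviated names are matched case-insensitively.
    return ALIASES.get(name.lower())
```

With this sketch, `resolve_parameter("Query")` yields `model.queryString`, while `resolve_parameter("model.querystring")` yields `None` because the full name has the wrong case.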

The parameters modeled as query profiles are also available through get methods as Java objects from the Query to Searcher components.

Parameters

Query
Native Execution Parameters
Query Model
Ranking
Presentation
Grouping
Streaming
Tracing
Semantic Rules
Dispatch
Other

Query

Parameter Alias Type Default Description
yql String

See the YQL query guide for examples, and the reference for details.

Native Execution Parameters

These parameters are defined in the native query profile type.

Parameter Alias Type Default Description
hits count Number 10

A non-negative integer. The maximum number of hits to return from the result set.

hits is capped at maxHits, default 400. maxHits can be set in a query profile.

Number of hits can also be set in YQL.

offset start Number 0

The number of hits to skip when returning the result, used to implement pagination. A non-negative integer.

offset is capped at maxOffset, default 1000. maxOffset can be set in a query profile.

Offset can also be set in YQL.
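As a minimal sketch of pagination with hits and offset, the helper below computes both values for a zero-based page and checks them against the default caps stated above (maxHits 400, maxOffset 1000). The function name `page_params` and the YQL string are illustrative assumptions. How the server treats out-of-range values is not described here, so the sketch simply refuses to build such a request.

```python
from urllib.parse import urlencode

MAX_HITS = 400     # default maxHits, per this reference
MAX_OFFSET = 1000  # default maxOffset, per this reference


def page_params(page: int, page_size: int) -> dict:
    """Return hits/offset request parameters for a zero-based page number."""
    if page < 0 or page_size < 0:
        raise ValueError("page and page_size must be non-negative")
    offset = page * page_size
    if page_size > MAX_HITS:
        raise ValueError(f"hits {page_size} exceeds maxHits {MAX_HITS}")
    if offset > MAX_OFFSET:
        raise ValueError(f"offset {offset} exceeds maxOffset {MAX_OFFSET}")
    return {"hits": page_size, "offset": offset}


# Third page (zero-based page 2) with 10 hits per page; the YQL is only an example.
query_string = urlencode({"yql": "select * from sources * where true", **page_params(2, 10)})
```

If an application needs deeper pagination than the defaults allow, maxHits and maxOffset can be raised in a query profile, as noted above.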

queryProfile String default

A query profile id with format name:version, where version can be omitted or partially specified, e.g. myprofile:2.1. A query profile has default properties for a query. The default query profile is named default.

groupingSessionCache Boolean true

Set to true to enable grouping session cache. See the grouping reference for details.

searchChain String default

A search chain id with format name:version, where version can be omitted or partially specified, e.g. mychain:2.1.3. The search chain initially invoked when processing the query. This search chain may invoke other chains.
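The name:version id format used by queryProfile and searchChain can be split as in the sketch below. It assumes that version components are integers separated by dots, as in the examples above; the function name `parse_component_id` is hypothetical.

```python
def parse_component_id(component_id: str):
    """Split 'name:version' into (name, version parts).

    The version may be omitted ('default') or partially specified ('mychain:2.1').
    """
    name, _, version = component_id.partition(":")
    if not name:
        raise ValueError("component id must start with a name")
    # Assumption: each version component is an integer, e.g. 2.1.3 -> (2, 1, 3).
    parts = tuple(int(p) for p in version.split(".")) if version else ()
    return name, parts
```

For example, `parse_component_id("mychain:2.1.3")` gives `("mychain", (2, 1, 3))`, and `parse_component_id("default")` gives `("default", ())`.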

timeout String 0.5s

Positive floating point number with an optional unit. Default unit is seconds (s), valid unit strings are e.g. ms and s. To set a timeout of one minute, the argument could be set to 60 s. Space between the number and the unit is optional.

It specifies the overall timeout of the query execution and can be defined in a query profile. Different classes of queries can then easily have a different latency budget/timeout using different profiles.

At timeout, the hits generated thus far are returned, refer to ranking.softtimeout.enable for details on HTTP status codes and response elements.

Refer to the Query API guide for more details on timeout handling.
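The timeout format described above (a positive number, an optional unit, optional space between them) can be parsed as in this sketch. It handles only the two units named in this reference, ms and s; other units may exist and are not covered. The function name `timeout_seconds` is hypothetical.

```python
import re

# Number, optional whitespace, optional unit. Only "ms" and "s" are handled here.
_TIMEOUT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s)?\s*$")
_UNITS_PER_SECOND = {"ms": 1000.0, "s": 1.0}


def timeout_seconds(value: str) -> float:
    """Convert a timeout string such as '60 s', '500ms' or '0.5' to seconds."""
    match = _TIMEOUT.match(value)
    if match is None:
        raise ValueError(f"unrecognized timeout: {value!r}")
    number = float(match.group(1))
    unit = match.group(2) or "s"  # the default unit is seconds
    if number <= 0:
        raise ValueError("timeout must be positive")
    return number / _UNITS_PER_SECOND[unit]
```

Here `timeout_seconds("60 s")` and `timeout_seconds("60s")` both give one minute, matching the note that the space is optional.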

Query Model Parameters

Parameter Alias Type Default Description
model.defaultIndex default-index String default

An index name. The field that is searched for query terms that do not explicitly specify an index. Also see the defaultIndex query annotation.

model.encoding encoding String utf-8

Encoding names or aliases defined in the IANA character sets. Sets the encoding to use when returning a result. The query is always encoded as UTF-8, independently of how the result will be encoded.

The encodings big5, euc-jp, euc-kr, gb2312, iso-2022-jp and shift-jis also influence how tokenization is done in the absence of an explicit language setting.

model.filter filter String

A filter string in the Simple Query Language. Sets a filter to be combined with the model.queryString. Typical use of a filter is to add machine generated or preferences based filter terms to the user query. Terms which are passed in the filter are not bolded. The filter is parsed the same way as a query of type any, the full syntax is available. The positive terms (preceded by +) and phrases act as AND filters, the negative terms (preceded by -) act as NOT filters, while the unprefixed terms will be used to RANK the results. Unless the query has no positive terms, the filter will only restrict and influence ranking of the result set, never cause more matches than the query.
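The roles of filter terms described above can be illustrated with a deliberately simplified classifier. It only recognizes whitespace-separated terms, the + and - prefixes, and double-quoted phrases; the real Simple Query Language has more syntax than this, so treat it as a sketch of the AND/NOT/RANK rule only. The function name `classify_filter_terms` is hypothetical.

```python
import re


def classify_filter_terms(filter_string: str) -> dict:
    """Group filter terms by role.

    '+' terms and phrases act as AND filters, '-' terms act as NOT filters,
    and unprefixed terms only influence RANK.
    """
    roles = {"AND": [], "NOT": [], "RANK": []}
    # A token is either an (optionally prefixed) quoted phrase or a run of non-space characters.
    for token in re.findall(r'[+-]?"[^"]*"|\S+', filter_string):
        if token.startswith("-"):
            roles["NOT"].append(token[1:].strip('"'))
        elif token.startswith("+"):
            roles["AND"].append(token[1:].strip('"'))
        elif token.startswith('"'):
            roles["AND"].append(token.strip('"'))
        else:
            roles["RANK"].append(token)
    return roles
```

For the filter `+lang:en -spam sale "free shipping"`, the sketch reports `lang:en` and `free shipping` as AND filters, `spam` as a NOT filter, and `sale` as a rank-only term.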

model.locale locale String

A language tag from RFC 5646, such as en-US. Sets the locale and language to use when parsing queries. This parameter should always be set when the language is known. If it is not set, the language is guessed from the query and encoding, and defaults to English if it cannot be guessed.

model.language lang, language String

A language tag from RFC 5646, but allowing underscore instead of dash as separator character. A legacy alternative to locale. When this value is accessed, underscores will be replaced by dashes in the returned value. Also see the language query term annotation.

model.queryString query String

A query string in the Simple Query Language. It is combined with model.filter. See the userQuery operator for how to combine with YQL. Can also be used without YQL.

model.restrict restrict String

A comma-delimited list of document type (schema) names, defaulting to all schemas if not set. See multiple schemas and federation.

Use model.sources to restrict to content cluster names or other source names.

model.searchPath path String

Specification of which content nodes a query should be sent to. This is useful for debugging/monitoring and when using Rank phase statistics. Note that in a content cluster with flat distribution (i.e. no <group> element in services.xml), there is 1 implicit group.

If not set, defaults to all nodes in one group, selected by load balancing.

searchpath::ELEMENT [';' ELEMENT]*

ELEMENT::NODE ['/' GROUP]

NODE::EXP [',' EXP]*

EXP::NUM | RANGE

GROUP::NUM

RANGE::'[' NUM ',' NUM '>'

Examples:

  • 7/3 = node 7, group 3.
  • 7/ = node 7, any group.
  • 7,1,9/0 = nodes 1,7 and 9, group 0.
  • 1,[3,9>/0 = nodes 1,3,4,5,6,7,8, group 0.
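The searchPath grammar and examples above can be read as the following parser sketch. It treats `[a,b>` as a half-open range (a included, b excluded), which is inferred from the last example, and it does not validate malformed input. The function name `parse_search_path` is hypothetical.

```python
import re

# EXP is either a RANGE '[low,high>' or a single NUM.
_EXP = re.compile(r"\[(\d+),(\d+)>|(\d+)")


def parse_search_path(search_path: str):
    """Return one (sorted node numbers, group or None) tuple per ELEMENT."""
    elements = []
    for element in search_path.split(";"):
        node_spec, _, group_spec = element.partition("/")
        nodes = set()
        for low, high, single in _EXP.findall(node_spec):
            if single:
                nodes.add(int(single))
            else:
                # '[low,high>' includes low and excludes high.
                nodes.update(range(int(low), int(high)))
        # An empty group part means any group.
        group = int(group_spec) if group_spec else None
        elements.append((sorted(nodes), group))
    return elements
```

Applied to the examples above, `parse_search_path("1,[3,9>/0")` returns `[([1, 3, 4, 5, 6, 7, 8], 0)]` and `parse_search_path("7/")` returns `[([7], None)]`.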
model.sources search, sources String

A comma-separated list of content cluster names or other source names, defaulting to all sources/clusters if not set. The names of the sources to query, e.g. one or more content clusters and/or federated sources - see content cluster mapping. Also see federation.

Use model.restrict to restrict to schemas.

model.type type String weakAnd

Selects the query language syntax of the model.queryString parameter: all, weakAnd, any, phrase, tokenize, web, yql - refer to simple query language reference. Also see YQL grammar.

Ranking

Parameter Alias Type Default Description
ranking.location String

See Geo search. Point (two-dimensional location) to use as base for location ranking.

ranking.features.featurename input.featurename, rankfeature.featurename String

Set a query rank feature input to a value. The key must be a query feature - query(anyname), and the value must be a double, string (to be hashed to a double), or a tensor matching the declared input type on tensor literal form - see the tensor user guide. Examples:

input.query(userageDouble)=42.1

input.query(stringToBeHashed)=abcd

input.query(myIndexedTensor)=[1.0, 2.0, 3.0]

input.query(myMappedTensor)={"Tablet Keyboard Cases": 0.8, "Keyboards":0.3}
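When building requests programmatically, the inputs above need URL encoding. The sketch below builds `input.query(name)` parameters from Python values and encodes them with the standard library. It assumes that the JSON rendering of a list or dict is an acceptable tensor literal, as the examples above suggest; the function name `rank_inputs` is hypothetical.

```python
import json
from urllib.parse import parse_qs, urlencode


def rank_inputs(inputs: dict) -> dict:
    """Map feature names to request parameters of the form input.query(name)=value."""
    params = {}
    for name, value in inputs.items():
        # Assumption: lists and dicts are sent as tensor literals in JSON syntax.
        literal = json.dumps(value) if isinstance(value, (list, dict)) else str(value)
        params[f"input.query({name})"] = literal
    return params


params = rank_inputs({
    "userageDouble": 42.1,
    "myIndexedTensor": [1.0, 2.0, 3.0],
    "myMappedTensor": {"Keyboards": 0.3},
})
encoded = urlencode(params)  # percent-encodes parentheses, brackets, braces and quotes
```

Decoding `encoded` with `parse_qs` gives back the original parameter names and literals, which is a quick way to check what will be sent.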

ranking.listFeatures rankfeatures Boolean false

Set to true to request all rank features to be calculated and returned. The rank features will be returned in the summary field rankfeatures. This option is typically used for MLR training and should not be used in production.

ranking.profile ranking String default

Sets the rank profile to use for assigning rank scores to documents. The default rank profile will be used for backends that do not have the given rank profile.

ranking.properties.propertyname rankproperty.propertyname String

Set a rank property that is passed to, and used by a feature executor for this query. Example: query=foo&ranking.properties.dotProduct.X={a:1,b:2}

ranking.softtimeout.enable Boolean true

By default, the hits available are returned on timeout. To return no hits at timeout instead, set ranking.softtimeout.enable=false.

Softtimeout uses ranking.softtimeout.factor of the timeout, default 70%. The rest of the time budget is spent on later ranking phases.

The factor is adaptive, per rank profile - the factor is adjusted based on remaining time after all ranking phases, unless overridden in the query using ranking.softtimeout.factor.

A timeout element is returned in the query response at timeout.

Example: a query with a 500 ms timeout that uses 300 ms (0.6 × 0.5 s) in first-phase ranking:

&ranking.softtimeout.enable=true&ranking.softtimeout.factor=0.6&timeout=0.5

The ranking.softtimeout settings control what the content nodes should do when the latency budget has almost been used (timeout times a factor): either return the documents recalled and ranked with the first-phase function within the time used, or produce no result:

  • With soft timeout disabled, the Vespa container will return a 504 timeout without any results.
  • When enabled, it will return the documents matched and ranked up until the timeout was reached, with a 200 OK response along with the reason the result set was degraded.

The container might respond with a timeout error with HTTP response code 504 even with soft timeout enabled if the timeout is set so low that the query does not make it to the content nodes, or the container does not have any time left after input and query processing to dispatch the query to the content nodes.

Read more about soft timeout in coverage degradation.
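The arithmetic behind the soft timeout can be sketched as below: the factor's share of the timeout is the point where the soft timeout takes effect, and the remainder is left for later ranking phases. The function name `soft_timeout_budget` is hypothetical, and the adaptive adjustment of the factor described above is not modeled.

```python
def soft_timeout_budget(timeout_s: float, factor: float = 0.7):
    """Split a timeout into (soft-timeout share, remainder for later phases), in seconds."""
    if timeout_s <= 0:
        raise ValueError("timeout must be positive")
    soft_share = timeout_s * factor
    return soft_share, timeout_s - soft_share


# The example above: timeout=0.5 and ranking.softtimeout.factor=0.6.
first_phase, later_phases = soft_timeout_budget(0.5, 0.6)
```

For the example request this gives roughly 0.3 s before the soft timeout takes effect and 0.2 s for the remaining work.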

ranking.softtimeout.factor Number 0.7

See ranking.softtimeout.enable.

ranking.sorting sorting String

A valid sort specification. Fields you want to sort on must be stored as document attributes in the index structure by adding attribute to the indexing statement.

ranking.freshness String

Sets the time which will be used as now during execution.

One of: [integer], an absolute time in seconds since epoch; now-[integer], a time [integer] seconds into the past; or now, the current time.
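The three accepted forms can be resolved to an absolute time as in this sketch. The function name `freshness_to_epoch` and the `current_time` argument (added to make the result reproducible) are hypothetical.

```python
import time


def freshness_to_epoch(value: str, current_time=None) -> int:
    """Resolve a ranking.freshness value to seconds since epoch."""
    now = int(time.time()) if current_time is None else int(current_time)
    if value == "now":
        return now
    if value.startswith("now-"):
        # now-N means N seconds into the past.
        return now - int(value[len("now-"):])
    # Otherwise the value is an absolute time in seconds since epoch.
    return int(value)
```

For example, with a current time of 1000, `freshness_to_epoch("now-60", 1000)` gives 940.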

ranking.queryCache Boolean false

Turns the query cache on or off. Query execution is a two-phase protocol: the first phase finds the best k documents across all nodes, and the second phase fills in summary data and, potentially, the ranking features listed in summary-features in the rank profile. If the query cache is on, the query is stored on the content nodes between the two phases, saving network bandwidth and query setup time at the expense of using more memory. It only affects the second protocol phase, see caches in Vespa. It does not cache the result; it only saves resources by not forwarding the query twice.

The summary-features are re-calculated, but this setting avoids sending the query down once more. There is little downside to using it, and it can save resources and latency when the query tree and query ranking features (e.g. tensors used in ranking) are large. As this is a protocol optimization, it also works when the filter changes: nothing is cached across independent queries, the setting only avoids sending the same query twice.

ranking.rerankCount Number

Specifies the number of hits that should be ranked in the second ranking phase. Overrides the rerank-count set in the rank profile.

ranking.keepRankCount Number

Specifies the number of hits that should keep rank value. Overrides the keep-rank-count set in the rank profile.

ranking.rankScoreDropLimit Number

Minimum rankscore for a document to be considered a hit. Overrides the rank-score-drop-limit set in the rank profile.

ranking.globalPhase.rerankCount Number

Specifies the number of hits that should be re-ranked in the global ranking phase. Overrides the rerank-count set in the rank profile.

ranking.matching

Settings to control behavior during matching of query evaluation. If these are set in the query, they will override any equivalent settings in the rank profile. Detailed descriptions are found in the rank profile documentation.

Parameter Alias Type Default Description
ranking.matching.numThreadsPerSearch integer

Rank profile equivalent: num-threads-per-search

Overrides the global persearch thread count with a lower value.

ranking.matching.minHitsPerThread integer

Rank profile equivalent: min-hits-per-thread

After estimating the number of hits for a query, this number is used to decide how many search threads to use.

ranking.matching.numSearchPartitions integer

Rank profile equivalent: num-search-partitions

Number of logical partitions the corpus on a content node is divided into. A partition is the smallest unit a search thread will handle.

ranking.matching.termwiseLimit double [0.0, 1.0]

Rank profile equivalent: termwise-limit

If estimated number of hits > corpus * termwise-limit, document candidates are pruned with a TAAT evaluation for query terms not needed for ranking.

ranking.matching.postFilterThreshold double [0.0, 1.0]

Rank profile equivalent: post-filter-threshold

Threshold value deciding if a query with an approximate nearestNeighbor operator combined with filters is evaluated using post-filtering.

ranking.matching.approximateThreshold double [0.0, 1.0]

Rank profile equivalent: approximate-threshold

Threshold value deciding if a query with an approximate nearestNeighbor operator combined with filters is evaluated by searching for approximate or exact nearest neighbors.

ranking.matching.targetHitsMaxAdjustmentFactor double [1.0, inf]

Rank profile equivalent: target-hits-max-adjustment-factor

Value used to control the auto-adjustment of targetHits used when evaluating an approximate nearestNeighbor operator with post-filtering.

ranking.matchPhase

Settings to control behavior during the match phase of query evaluation. If these are set in the query, they will override any match-phase settings in the rank profile. Detailed descriptions are found in the rank profile documentation.

Parameter Alias Type Default Description
ranking.matchPhase.attribute string

Rank profile equivalent: match-phase: attribute

The attribute used to limit matches when more than maxHits hits would otherwise be produced.

ranking.matchPhase.maxHits long

Rank profile equivalent: match-phase: max-hits

The max number of hits that should be generated on each content node during the match phase.

ranking.matchPhase.ascending boolean

Rank profile equivalent: match-phase: order

Whether to keep the documents having the highest (false) or lowest (true) values of the match phase attribute.

ranking.matchPhase.diversity.attribute string

Rank profile equivalent: diversity: attribute

The attribute to use when deciding diversity.

ranking.matchPhase.diversity.minGroups long

Rank profile equivalent: diversity: min-groups

The minimum number of groups that should be returned from the match phase grouped by the diversity attribute.

Dispatch

Parameter Alias Type Default Description
dispatch.topKProbability double

Probability to use when computing how many hits to fetch from each partition when merging and creating the final result set. See services for details.

Default: none.

Presentation

Parameter Alias Type Default Description
presentation.bolding bolding Boolean true

Whether or not to bold query terms in schema fields defined with bolding: on or summary: dynamic.

presentation.format format String default

Value Description
No value or default The default, builtin JSON format
json Builtin JSON format
xml Builtin XML format.
page XML format which is suitable for use with page templates.
Any other value A custom result renderer supplied by the application
presentation.summary summary String

The name of the summary class used to select fields in results.

Default: The default summary class of the schema.

presentation.template String

The id of a deployed page template to use for this result. This should be used with the page result format.

presentation.timing Boolean false

Whether a result renderer should try to add optional timing information to the rendered page - see the result reference.

presentation.format.tensors String short

Controls how tensors are rendered in the result.

Value Description
short Render the tensor value in an object having two keys, "type" containing the value, and "cells"/"blocks"/"values" (depending on the type) containing the tensor content. Render the tensor content in the type-appropriate short form.
long Render the tensor value in an object having two keys, "type" containing the value, and "cells" containing the tensor content. Render the tensor content in the general verbose form.
short-value Render the tensor content directly. Render the tensor content in the type-appropriate short form.
long-value Render the tensor content directly. Render the tensor content in the general verbose form.

Grouping and Aggregation

Parameter Alias Type Default Description
select String

Requests specific multi-level result set statistics and/or hit groups to be returned in the result. Fields you want to retrieve statistics or hit groups for must be stored as document attributes in the index structure by adding attribute to the indexing statement.

Default is no grouping.

See the grouping guide for examples.

collapsefield String

Comma-separated list of document summary field names. Results are collapsed (i.e. aggregated) using the fields one after another. Collapsing runs in the container, not on the content nodes. Define one or more collapsefields to remove duplicates if the corpus has few duplicates, as this is more efficient than using grouping. Otherwise, use grouping.

Default is no field collapsing.

collapsesize Number 1

The number of hits to keep in each collapsed bucket - used for all collapsefields.

collapsesize.fieldname Number 1

The number of hits to keep in each collapsed bucket - used for the specified field. This value takes precedence over the value specified in collapsesize.

collapse.summary String

A valid name of a document summary class. Use this summary class to fetch the fields used for collapsing.

Default: Use default summary or attributes.

grouping.defaultMaxGroups Number 10

Positive integer or -1 to disable.

The default number of groups to return when max is not specified.

grouping.defaultMaxHits Number 10

Positive integer or -1 to disable.

The default number of hits to return when max is not specified.

grouping.globalMaxGroups Number 10000

Positive integer or -1 to disable.

A cost limit for grouping queries. Any query that may exceed this threshold will be preemptively failed by the container. The limit is defined as the total number of groups and document summaries a query may produce. A query that does not have an implicit or explicit max defined for all levels will always fail if limit is enabled. This parameter can only be overridden in a query profile.

See the grouping guide for practical examples.

grouping.defaultPrecisionFactor Decimal number 2.0

The default precision scale factor when precision is not specified. The final precision value is calculated by multiplying the effective max value with the scale factor.

Streaming

Parameters for streaming search mode.

Parameter Alias Type Default Description
streaming.groupname A string

Sets the group (specified by g=<groupname>) of the documents to stream through.

streaming.selection A document selection

Restricts streaming search using a selection expression instead of a group id.

If the selection is on the form id.group == "foo" or id.group == "bar" or id.group == ... this will only stream documents in those groups, which is efficient for a small number of groups.

If any other selection is used, this will stream through all groups, which is very costly.

streaming.maxbucketspervisitor An integer Positive infinity

If set, limit backend bucket concurrency to the specified number of buckets. Can be used to explicitly control resource usage for extremely large streaming search locations. This is an expert option.

Tracing

Parameters controlling trace information returning with the result for diagnostics.

Parameter Alias Type Default Description
trace.level tracelevel Number 0

A positive number. Default is no tracing.

Collect trace information for debugging when running a query. Higher numbers give progressively more detail on query transformations, searcher execution and content node(s) query execution. See query tracing for details and examples.

Tracing is subject to change at any time, the below is a guide:

Level Description
1 Basic tracing in container
2 Basic tracing, more details
3 Basic tracing, even more details
4 Include timing info from content nodes
5 Even more timing info from content nodes
6 Include the query execution plan (blueprint)
7 Include the query execution tree
trace.explainLevel explainlevel Number 0

Set to a positive number to collect query execution information for debugging when running a query. Higher numbers give progressively more detail on content node query execution. Tuning this parameter is useful if we want to get more information from the content nodes without gathering lots of trace information from the container chain.

Explanation is subject to change at any time, the below is a guide:

Level Description
1 Timing and overall query plan (blueprint) from each content node
2 Timing per search thread and execution tree (search iterator tree)

Note that you might get the same at trace.level 5 and above. Default is no explanation.

Tracing with trace.explainLevel also requires that trace.level is positive.

trace.profileDepth Number 0

Turns on performance profiling of the content node query execution for matching, first-phase ranking, and second-phase ranking. How profiling is performed is based on whether trace.profileDepth is positive or negative:

Type Description
Tree A positive number specifies the depth used by a tree profiler. A higher number means more profiler data. The output resembles the structure of the search iterator tree or rank expression tree being profiled, with total time and self time tracked per component (node in the tree).
Flat A negative number specifies the topn (cut-off) used by a flat profiler. The output returns the topn components that use the most self time.

The performance profiling output is subject to change at any time. Default is no information.

Tracing with trace.profileDepth also requires that trace.level is positive.

trace.profiling.matching.depth Number 0

Turns on profiling of matching of the content node query execution. This exposes information about how time spent on matching is distributed between individual search iterators. The profiling output is tagged match_profiling and is subject to change at any time. Default is no information. See trace.profileDepth for semantics of this parameter.

Tracing with trace.profiling.matching.depth requires that trace.level is positive.

trace.profiling.firstPhaseRanking.depth Number 0

Turns on profiling of the first-phase ranking of the content node query execution. This exposes information about how time spent on first-phase ranking is distributed between individual rank features. The profiling output is tagged first_phase_profiling and is subject to change at any time. Default is no information. See trace.profileDepth for semantics of this parameter.

Tracing with trace.profiling.firstPhaseRanking.depth also requires that trace.level is positive.

trace.profiling.secondPhaseRanking.depth Number 0

Turns on profiling of the second-phase ranking of the content node query execution. This exposes information about how time spent on second-phase ranking is distributed between individual rank features. The profiling output is tagged second_phase_profiling and is subject to change at any time. Default is no information. See trace.profileDepth for semantics of this parameter.

Tracing with trace.profiling.secondPhaseRanking.depth also requires that trace.level is positive.

trace.timestamps Boolean false

Enable to get timing information already at trace.level=1. This is useful for debugging latency spent at different components in the container search chain without rendering a lot of string data which is associated with higher trace levels.

trace.query Boolean true

Whether to include the query in any trace messages. Setting this to false avoids serializing very large queries, which can otherwise hurt performance and produce excessively large traces.

Semantic Rules

Refer to semantic rules.

Parameter Alias Type Default Description
rules.off Boolean true

Turn rule evaluation off for this query.

rules.rulebase String

A rule base name - the name of the rule base to use for these queries.

tracelevel.rules Number

The amount of rule evaluation trace output to show, higher number means more details. This is useful to see a trace from rule evaluation without having to see trace from all other searchers at the same time.

Other

Parameter Alias Type Default Description
recall String

Any allowed collection of recall terms. Sets a recall parameter to be combined with the query. This is identical to filter, except that recall terms are not exposed to the ranking framework and thus not ranked. As such, unprefixed terms cannot be used; each term must be either positive or negative.

user String

The id of the user making the query. The content of the argument is made available to the search chain, but it triggers no features in Vespa apart from being propagated to the access log.

hitcountestimate Boolean false

Make this an estimation query. No hits will be returned, and total hit count will be set to an estimate of what executing the query as a normal query would give.

metrics.ignore Boolean false

Ignore metric collection for this query request, useful for warm-up queries.

weakAnd.replace Boolean false

Replace all instances of OR in the query tree with weakAnd.

wand.hits Number 100

Used in combination with weakAnd.replace. Sets the targetHits of the new weakAnds to the specified value.

sorting.degrading Boolean true

When sorting on a single-value numeric attribute with fast-search an optimization is activated to return early, with an inaccurate total-hits count. Set sorting.degrading to false to disable this optimization.

This optimization sets the primary sorting attribute as the match phase attribute, and match phase maxHits equal to max(10000, maxHits+maxOffset). maxHits and maxOffset can be set in a query profile.
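The match phase limit used by this optimization is plain arithmetic and can be checked as below. The defaults are the maxHits and maxOffset defaults stated earlier in this reference; the function name `degraded_match_phase_max_hits` is hypothetical.

```python
def degraded_match_phase_max_hits(max_hits: int = 400, max_offset: int = 1000) -> int:
    """Match phase maxHits used when sorting.degrading applies: max(10000, maxHits+maxOffset)."""
    return max(10000, max_hits + max_offset)
```

With the defaults, maxHits+maxOffset is 1400, so the limit stays at 10000; it only grows once the two settings sum to more than 10000.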

noCache nocache Boolean false

Sets whether this query should never be served from a cache. Vespa has few caches, and this parameter does not control any of them. Therefore, this parameter has no effect.

HTTP status codes

The status code rules are:

  • If the Result contains no errors (Result.hits().getError()==null): 200 OK is returned.
  • If the Result contains errors and no regular hits:
    • If the error code of any ErrorMessage in the Result (Result.hits().getErrorHit().errorIterator()) is a "WEB SERVICE ERROR CODE", the first of those is returned.
    • Otherwise, if it is an "HTTP COMPATIBLE ERROR CODE", the mapping of it is returned.
    • Otherwise 500 INTERNAL_SERVER_ERROR is returned.
  • If the Result contains errors and also contains valid hits: The same as above, but 200 OK is returned by default instead of 500.

WEB SERVICE ERROR CODES:

200, 301, 302, 307, 400, 401, 403, 404, 405, 406, 408, 428, 429, 431, 500, 501, 502, 511

HTTP COMPATIBLE ERROR CODES:

com.yahoo.container.protect.Error.BAD_REQUEST -> Http code 400
com.yahoo.container.protect.Error.UNAUTHORIZED -> Http code 401
com.yahoo.container.protect.Error.FORBIDDEN -> Http code 403
com.yahoo.container.protect.Error.NOT_FOUND -> Http code 404
com.yahoo.container.protect.Error.INTERNAL_SERVER_ERROR -> Http code 500
com.yahoo.container.protect.Error.INSUFFICIENT_STORAGE -> Http code 507
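The status code rules and the two tables above can be sketched as one function. This is a model for illustration, not the container's implementation: error codes are represented either as integers (candidate web service codes) or as the names of the com.yahoo.container.protect.Error constants, and the function name `http_status` is hypothetical. The rules do not say which error is mapped when several are HTTP compatible; the sketch assumes the first one in order.

```python
WEB_SERVICE_ERROR_CODES = {
    200, 301, 302, 307, 400, 401, 403, 404, 405, 406, 408, 428, 429, 431, 500, 501, 502, 511,
}

# Names of com.yahoo.container.protect.Error constants and their HTTP codes.
HTTP_COMPATIBLE_ERROR_CODES = {
    "BAD_REQUEST": 400,
    "UNAUTHORIZED": 401,
    "FORBIDDEN": 403,
    "NOT_FOUND": 404,
    "INTERNAL_SERVER_ERROR": 500,
    "INSUFFICIENT_STORAGE": 507,
}


def http_status(error_codes: list, has_hits: bool) -> int:
    """Pick the HTTP status for a result, given its error codes in order."""
    if not error_codes:
        return 200  # no errors: 200 OK
    for code in error_codes:
        if code in WEB_SERVICE_ERROR_CODES:
            return code  # the first web service error code wins
    for code in error_codes:
        if code in HTTP_COMPATIBLE_ERROR_CODES:
            return HTTP_COMPATIBLE_ERROR_CODES[code]  # assumption: first mappable error
    # Fallback: 200 if the result also has valid hits, otherwise 500.
    return 200 if has_hits else 500
```

For example, a result with only an unmapped error code and no hits yields 500, while the same error alongside valid hits yields 200.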

select

A select query is equivalent in structure to YQL, written in JSON. Contains subparameters where and grouping.

Parameter Alias Type Default Description
where String

A string with JSON. Refer to the select reference for details.

grouping String

A string with JSON. Refer to the select reference for details.