Query API Reference
All the request parameters listed below can be set in query profiles. The first four blocks of properties are also modeled as query profile types. These types can be referenced from query profiles (and inheriting types) to provide type checking on the parameters.
These parameters often have both a full name, including the path from the root query profile, and one or more abbreviated names. Both kinds of name can be used in requests, while only full names can be used in query profiles. Full names are case sensitive; abbreviated names are case insensitive.
The parameters modeled as query profile types are also available to Searcher components as Java objects, through get methods on the Query.
Parameters
- Native Execution Parameters
- Query Model Parameters
- model.defaultIndex [default-index]
- model.encoding [encoding]
- model.filter [filter]
- model.locale [locale]
- model.language [lang, language]
- model.queryString [query]
- model.restrict [restrict]
- model.searchPath [path]
- model.sources [search, sources]
- model.type [type]
- Ranking
- ranking.location [location]
- ranking.features [rankfeature]
- ranking.listFeatures [rankfeatures]
- ranking.profile [ranking]
- ranking.properties [rankproperty]
- ranking.softtimeout
- ranking.sorting [sorting]
- ranking.freshness
- ranking.queryCache
- ranking.rerankCount
- ranking.matchPhase
- Presentation
- presentation.bolding [bolding]
- presentation.format [format]
- presentation.template
- presentation.summary [summary]
- presentation.timing
- Grouping
- Geographical Searches
- Streaming Search
- Semantic Rules
- Dispatch
- Other
HTTP status codes
The status code rules are:
- If the Result contains no errors (Result.hits().getError()==null): 200 OK is returned.
- If the Result contains errors and no regular hits:
- If the error code of any ErrorMessage in the Result (Result.hits().getErrorHit().errorIterator()) is a "WEB SERVICE ERROR CODE", the first of those is returned.
- Otherwise, if it is a "HTTP COMPATIBLE ERROR CODE", the mapping of it is returned.
- Otherwise 500 INTERNAL_SERVER_ERROR is returned.
- If the Result contains errors and also contains valid hits: The same as above, but 200 OK is returned by default instead of 500.
WEB SERVICE ERROR CODES:
200, 301, 302, 307, 400, 401, 403, 404, 405, 406, 408, 428, 429, 431, 500, 501, 502, 511
HTTP COMPATIBLE ERROR CODES:
com.yahoo.container.protect.Error.BAD_REQUEST -> Http code 400
com.yahoo.container.protect.Error.UNAUTHORIZED -> Http code 401
com.yahoo.container.protect.Error.FORBIDDEN -> Http code 403
com.yahoo.container.protect.Error.NOT_FOUND -> Http code 404
com.yahoo.container.protect.Error.INTERNAL_SERVER_ERROR -> Http code 500
com.yahoo.container.protect.Error.INSUFFICIENT_STORAGE -> Http code 507
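The status code rules above can be sketched as a small Python function. This is an illustration only, not the container's implementation: the function name http_status, the representation of error codes (integers for web service codes, short names for the com.yahoo.container.protect.Error constants) and the choice of the first matching error in the second rule are all assumptions.

```python
# Codes returned as-is, copied from the list above.
WEB_SERVICE_ERROR_CODES = {
    200, 301, 302, 307, 400, 401, 403, 404, 405,
    406, 408, 428, 429, 431, 500, 501, 502, 511,
}

# Mapping copied from the list above; the short string keys are a
# hypothetical stand-in for the com.yahoo.container.protect.Error constants.
HTTP_COMPATIBLE_ERROR_CODES = {
    "BAD_REQUEST": 400,
    "UNAUTHORIZED": 401,
    "FORBIDDEN": 403,
    "NOT_FOUND": 404,
    "INTERNAL_SERVER_ERROR": 500,
    "INSUFFICIENT_STORAGE": 507,
}


def http_status(error_codes, has_hits):
    """Pick the HTTP status for a result, following the rules listed above."""
    if not error_codes:
        return 200
    # Rule 1: the first web service error code wins.
    for code in error_codes:
        if code in WEB_SERVICE_ERROR_CODES:
            return code
    # Rule 2: otherwise use the mapping of an HTTP compatible error code
    # (assumption: the first one found).
    for code in error_codes:
        if code in HTTP_COMPATIBLE_ERROR_CODES:
            return HTTP_COMPATIBLE_ERROR_CODES[code]
    # Rule 3: the fallback depends on whether valid hits are present.
    return 200 if has_hits else 500
```

For example, a result with only an unrecognized error code and no hits maps to 500, while the same error alongside valid hits maps to 200.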
Query
yql
Alias | |
Values | String. The YQL query to be parsed and executed. |
Default | None |
select
Select query is equivalent to YQL, written in JSON.
Contains subparameters where and grouping.
where
Alias | |
Values | JSON |
Default | None |
grouping
Alias | |
Values | JSON |
Default | None |
The where and grouping queries are parsed and executed in the backend. Refer to Select Reference for details.
Native Execution Parameters
These parameters are defined in the native query profile type.
hits
Alias | count |
Values | A positive integer, or 0. The sum of offset and hits should be lower than the configured maxoffset value, and will be adjusted to fit. See also the comment at offset. |
Default | 10 |
The maximum number of hits to return from the result set. Must be lower than maxHits, which is either set in a query profile or defaults to 400.
offset
Alias | start |
Values | A positive integer, including 0. |
Default | 0 |
The index of the first hit to return from the result set. Must be lower than maxOffset, which is either set in a query profile or defaults to 1000.
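As an illustration of how hits and offset combine for paging, the following Python sketch turns a zero-based page number into the two parameters and checks them against the default limits stated above. The helper name page_params and the example YQL string are hypothetical, and the strict "lower than" checks follow the wording of this page literally.

```python
from urllib.parse import urlencode

DEFAULT_MAX_HITS = 400     # default maxHits stated above
DEFAULT_MAX_OFFSET = 1000  # default maxOffset stated above


def page_params(page, page_size,
                max_hits=DEFAULT_MAX_HITS, max_offset=DEFAULT_MAX_OFFSET):
    """Return the hits/offset parameters for a zero-based page number."""
    if page < 0 or page_size < 0:
        raise ValueError("page and page_size must not be negative")
    offset = page * page_size
    if page_size >= max_hits:
        raise ValueError("hits must be lower than maxHits")
    if offset >= max_offset:
        raise ValueError("offset must be lower than maxOffset")
    return {"hits": page_size, "offset": offset}


# Third page of ten hits each; the YQL string is only a placeholder.
params = {"yql": "select * from sources * where true", **page_params(2, 10)}
query_string = urlencode(params)
```

The resulting query_string can be appended to the search endpoint of the application.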
queryProfile
Alias | None |
Values | A query profile id - name:version, where version can be omitted or partially specified, e.g "myprofile:2.1" |
Default | default |
A query profile has default properties for a query. The default query profile is named default. Example:
<query-profile id="default">
  <field name="maxHits">10</field>
  <field name="maxOffset">1000</field>
</query-profile>
groupingSessionCache
Alias | |
Values | True or false |
Default | true |
Set to true to store intermediate grouping results in the content nodes when using multi-level grouping expressions, in order to speed up grouping at a potential loss of accuracy. This causes the query and grouping expression to be run only once.
With multi-level grouping expressions, the search query is normally re-run for each level. The drawback is that, with an expensive ranking function, the query takes more time than strictly necessary.
Note: The flag is only useful if the grouping expression does not have an order() clause.
The drawback of using this flag is that when max() is specified in the grouping expression, it might cause inaccuracies in aggregated values such as count(). It is hence recommended to test whether this is an issue, and if so, adjust the precision parameter to still get correct counts.
searchChain
Alias | |
Values | A search chain id - name:version, where version can be omitted or partially specified, e.g "mychain:2.1.3". |
Default | default |
The search chain initially invoked when processing the query. This search chain may invoke other chains.
timeout
Alias | |
Values | Positive floating point number with an optional unit. Default unit is seconds (s), valid unit strings are e.g. ms and s. To set a timeout of one minute, the argument could be set to 60 s. Space between the number and the unit is optional. |
Default | 500 milliseconds. |
The query timeout. A query that times out without soft timeout returns 0 hits in the result set. To return hits at timeout, refer to ranking.softtimeout.
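The value format described above (a positive number, an optional unit, optional space in between) can be parsed along these lines. This is a minimal sketch, not the parser used by the container: the function name timeout_seconds is hypothetical, and only the two units named on this page (s and ms) are handled.

```python
import re

# Number, optional whitespace, optional unit.
_TIMEOUT = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*([a-zA-Z]*)\s*$")

# Divisors that convert a value in the given unit to seconds.
# Only the units named above are included; the default unit is seconds.
_UNIT_DIVISOR = {"": 1, "s": 1, "ms": 1000}


def timeout_seconds(value):
    """Convert a timeout argument such as '60 s' or '500ms' to seconds."""
    match = _TIMEOUT.match(value)
    if match is None:
        raise ValueError(f"not a valid timeout: {value!r}")
    number, unit = match.groups()
    if unit not in _UNIT_DIVISOR:
        raise ValueError(f"unknown unit in this sketch: {unit!r}")
    seconds = float(number) / _UNIT_DIVISOR[unit]
    if seconds <= 0:
        raise ValueError("timeout must be positive")
    return seconds
```

With this sketch, "60 s", "60s" and "60" all denote one minute, and "500ms" denotes the documented default.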
tracelevel
Alias | |
Values | Any positive number. Tracing is subject to change at any time. |
Default | No tracing |
Set to a positive number to collect trace information for debugging when running a query. Higher numbers give progressively more detail on query transformations, searcher execution and search backend execution.
explainlevel
Alias | |
Values | Any positive number. Note that you might get the same at tracelevel 5 and above. Explanation is subject to change at any time. |
Default | No explanation |
Set to a positive number to collect query execution information for debugging when running a query. Higher numbers give progressively more detail on backend query execution. Must be combined with non-zero tracelevel.
trace.timestamps
Alias | |
Values | true or false |
Default | No timestamps in trace |
Enable to get timing information already at tracelevel=1. This is useful for debugging latency spent in different components of the search chain without rendering the large amount of string data associated with higher trace levels.
Query Model Parameters
model.defaultIndex [default-index]
Alias | default-index |
Values | An index name |
Default | default |
The field searched for query terms that do not explicitly specify an index.
model.encoding [encoding]
Alias | encoding |
Values | Encoding names or aliases defined in the IANA character sets |
Default | utf-8 |
Sets the encoding to use when returning a result. The encodings big5, euc-jp, euc-kr, gb2312, iso-2022-jp and shift-jis also influence how tokenization is done in the absence of an explicit language setting.
The query is always encoded as UTF-8, independently of how the result will be encoded.
model.filter [filter]
Alias | filter |
Values | A filter string in the Simple Query Language |
Default | Not set |
Sets a filter to be combined with the model.queryString. Typical use of a filter is to add machine-generated or preference-based filter terms to the user query.
The filter is parsed the same way as a query of type any; the full syntax is available. The positive terms (preceded by +) and phrases act as AND filters, the negative terms (preceded by -) act as NOT filters, while the unprefixed terms will be used to RANK the results. Unless the query has no positive terms, the filter will only restrict and influence ranking of the result set, never cause more matches than the query.
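The roles of filter terms described above can be illustrated with a deliberately naive Python sketch. It is not the Simple Query Language parser: it splits on whitespace, recognizes only double-quoted phrases, and treats an unprefixed phrase as an AND filter, which is one reading of the sentence above. The function name classify_filter_terms and the example terms are hypothetical.

```python
import re

# A token is an optionally prefixed double-quoted phrase,
# or any run of non-space characters.
_TOKEN = re.compile(r'[+-]?"[^"]*"|\S+')


def classify_filter_terms(filter_string):
    """Group filter tokens by the role described above: AND, NOT or RANK."""
    roles = {"AND": [], "NOT": [], "RANK": []}
    for token in _TOKEN.findall(filter_string):
        if token.startswith("+"):
            roles["AND"].append(token[1:])   # positive term
        elif token.startswith("-"):
            roles["NOT"].append(token[1:])   # negative term
        elif token.startswith('"'):
            roles["AND"].append(token)       # phrase (assumed to act as AND)
        else:
            roles["RANK"].append(token)      # unprefixed term: ranking only
    return roles
```

For the filter string `+lang:en -spam cheap`, this yields one AND term, one NOT term and one RANK term.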
model.locale [locale]
Alias | locale |
Values | A language tag from RFC 5646 |
Default | Not set |
Sets the locale and language to use when parsing queries from a language tag, such as "en-US". This attribute should always be set when it is known. If this parameter is not set, it will be guessed from the query and encoding, and default to English if it cannot be guessed.
model.language [lang, language]
Alias | language, lang |
Values | A language tag from RFC 5646, but allowing underscore instead of dash as separator character. |
Default | Unspecified |
A legacy alternative to locale. When this value is accessed, underscores will be replaced by dashes in the returned value.
model.queryString [query]
Alias | query |
Values | A query string in the Simple Query Language. It is combined with model.filter. See the userQuery operator for how to combine with YQL. Can also be used without YQL. |
Default | Not set |
model.restrict [restrict]
Alias | restrict |
Values | A comma delimited list of document type names. |
Default | Search unrestricted |
The document types to restrict the query to when different document types share the same content cluster. See Querying multiple document types.
model.searchPath [path]
Alias | searchpath |
Values | |
Default | All nodes in one group chosen by load balancing. |
Specification of which content nodes a query should be sent to. This is useful for debugging/monitoring.
Examples (note that a content cluster with flat distribution has one implicit group):
- '7/3' = node 7, group 3.
- '7/' = node 7, any group.
- '7,1,9/0' = nodes 1,7 and 9, group 0.
- '1,[3,9>/0' = nodes 1,3,4,5,6,7,8, group 0.
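To make the notation in the examples above concrete, here is a minimal Python sketch that expands a search path into a node list and a group. It handles only the forms shown in the examples (single nodes, comma-separated nodes, half-open ranges written as [from,to>, and an optional group after the slash); the function name parse_search_path is hypothetical and other forms the real parser may accept are not covered.

```python
import re

_ELEMENT = r"(?:\[\d+,\d+>|\d+)"
_NODES = re.compile(rf"{_ELEMENT}(?:,{_ELEMENT})*")
_TOKEN = re.compile(r"\[(\d+),(\d+)>|(\d+)")


def parse_search_path(spec):
    """Return (sorted node numbers, group number or None for any group)."""
    nodes_part, _, group_part = spec.partition("/")
    if not _NODES.fullmatch(nodes_part):
        raise ValueError(f"unsupported node specification: {nodes_part!r}")
    nodes = set()
    for match in _TOKEN.finditer(nodes_part):
        if match.group(3) is not None:
            nodes.add(int(match.group(3)))       # a single node
        else:
            start, end = int(match.group(1)), int(match.group(2))
            nodes.update(range(start, end))      # [start,end> excludes end
    group = int(group_part) if group_part else None
    return sorted(nodes), group
```

Applied to the last example above, the range [3,9> expands to nodes 3 through 8, so the result is nodes 1,3,4,5,6,7,8 in group 0.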
model.sources [search, sources]
Alias | search, sources |
Values | A comma separated list of search cluster names or other source names |
Default | Search unrestricted |
The names of the sources to search, e.g. one or more search clusters and/or federated sources. See Querying multiple document types.
model.type [type]
Alias | type |
Values | web, all, any, phrase, yql, adv (deprecated) - refer to simple query language reference |
Default | all |
Selects the query language syntax of the model.queryString parameter.
Ranking
ranking.location (deprecated)
Alias | location |
Values | See Geo search |
Default | None |
Point (two dimensional location) to use as base for location ranking. Deprecated in favor of adding a geoLocation item to the query tree (inside a rank operator if it should be used only for ranking).
ranking.features.featurename [rankfeature.featurename]
Alias | rankfeature.featurename |
Values | Any string |
Default | None |
Set a rank feature to a value. This works for any key name query(anyname) (query features), and also as a way to override all existing (match and document) features.
Example: query=foo&ranking.features.query(userage)=42&ranking.features.fieldMatch(title)=0.65
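When building such a request programmatically, the feature names contain parentheses and should be URL-encoded. The following Python sketch shows one way to do that with the standard library; the helper name rank_feature_params is hypothetical, and the feature names and values are the ones from the example above.

```python
from urllib.parse import parse_qs, urlencode


def rank_feature_params(features):
    """Prefix each feature name with ranking.features. (the full name)."""
    return {f"ranking.features.{name}": str(value)
            for name, value in features.items()}


params = {
    "query": "foo",
    **rank_feature_params({"query(userage)": 42, "fieldMatch(title)": 0.65}),
}
query_string = urlencode(params)  # percent-encodes the parentheses

# Decoding gives back the original parameter names and values.
decoded = parse_qs(query_string)
```

The same pattern applies to the abbreviated rankfeature.featurename form by changing the prefix.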
ranking.listFeatures [rankfeatures]
Alias | rankfeatures |
Values | boolean |
Default | false |
Set to true to request all rank-features to be calculated and returned. The rank features will be returned in the summary field rankfeatures. This option is typically used for MLR training and should not be used in production.
ranking.profile [ranking]
Alias | ranking |
Values | Any rank profile name |
Default | default |
Sets the rank profile to use for assigning rank scores to documents. The default rank profile will be used for backends that do not have the given rank profile.
ranking.properties.propertyname [rankproperty.propertyname]
Alias | rankproperty.propertyname |
Values | Any string |
Default | None |
Set a rank property that is passed to, and used by a feature executor for this query. Example: query=foo&ranking.properties.dotProduct.X={a:1,b:2}
ranking.softtimeout
By default, the hits available are returned on timeout. To return no hits at timeout instead, set ranking.softtimeout.enable=false. Soft timeout uses ranking.softtimeout.factor of the timeout, 70% by default. The rest of the time budget is spent on later ranking phases. The factor is adaptive, per rank profile: it is adjusted based on the time remaining after all ranking phases, unless overridden in the query using ranking.softtimeout.factor. Example: a query with a 500 ms timeout that uses 300 ms in first-phase ranking:
&ranking.softtimeout.enable=true&ranking.softtimeout.factor=0.6&timeout=0.5
Read more about softtimeout in the Coverage degradation documentation.
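The arithmetic behind the example above is simply the timeout multiplied by the factor. A small Python sketch, with the hypothetical helper name soft_timeout_budget, makes the split explicit; it ignores the adaptive adjustment described above and only applies a fixed factor.

```python
def soft_timeout_budget(timeout_seconds, factor=0.7):
    """Split a timeout into (first-phase budget, remaining budget) in seconds."""
    if not 0 <= factor <= 1:
        raise ValueError("factor must be in the range [0, 1]")
    first_phase = timeout_seconds * factor
    return first_phase, timeout_seconds - first_phase


# The example above: timeout=0.5 and ranking.softtimeout.factor=0.6
first_phase, remaining = soft_timeout_budget(0.5, 0.6)
```

Here first_phase is about 0.3 seconds (300 ms) and remaining is about 0.2 seconds, which is left for the later ranking phases.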
ranking.softtimeout.enable
Alias | |
Values | boolean |
Default | true |
ranking.softtimeout.factor
Alias | |
Values | [0 - 1] |
Default | 0.7 |
ranking.sorting [sorting]
Alias | sorting |
Values | A valid sort specification |
Default | None - order by relevance |
A specification of how to sort the result. Fields you want to sort on must be stored as document attributes in the index structure by adding attribute to the indexing statement.
ranking.freshness
Alias | |
Values | [integer], an absolute time in seconds since epoch; now-[number], to use a time [number] seconds into the past; or now, to use the current time |
Default | None - use the current time on each node. |
Sets the time which will be used as now during execution.
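The three accepted value forms can be resolved to a point in time as in the Python sketch below. It only illustrates the value syntax listed above and is not the content node implementation; the function name resolve_freshness and its current_time parameter (used to make the result deterministic) are hypothetical.

```python
import re
import time


def resolve_freshness(value, current_time=None):
    """Resolve a ranking.freshness value to seconds since epoch."""
    now = int(time.time()) if current_time is None else current_time
    if value == "now":
        return now
    relative = re.fullmatch(r"now-(\d+)", value)
    if relative:
        return now - int(relative.group(1))  # this many seconds into the past
    if re.fullmatch(r"\d+", value):
        return int(value)                    # absolute time since epoch
    raise ValueError(f"not a valid freshness value: {value!r}")
```

For instance, with the current time at 1000 seconds since epoch, "now-60" resolves to 940.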
ranking.queryCache
Alias | |
Values | boolean |
Default | false |
Turns the query cache on or off. Query execution is a two-phase process. If the query cache is on, the query is stored on the content nodes between the first and second phase, saving network bandwidth and query setup time at the expense of using more memory.
ranking.rerankCount
Alias | |
Values | integer |
Default | null |
Specifies the number of hits that should be ranked in the second ranking phase. Overrides the rerank-count set in the rank profile.
ranking.matchPhase
Settings which control Vespa's behavior during the match phase. If these are set in the query, they will override any match-phase setting in the rank profile.
- ranking.matchPhase.maxHits: the max number of hits that should be generated during the match phase.
- ranking.matchPhase.attribute: the attribute to limit matches by if more than maxHits hits will be generated.
- ranking.matchPhase.ascending: whether to keep the documents having the highest (default) or lowest values of the attribute.
- ranking.matchPhase.diversity.attribute: the attribute to use to guarantee diversity.
- ranking.matchPhase.diversity.minGroups: the minimum number of groups grouped by the diversity attribute.
ranking.matchPhase.maxHits
Alias | |
Values | long |
Default | If sorting and not ranking: max(10000, maxhits+maxoffset). Otherwise: none. |
The max hits the engine should attempt to produce in the match phase on each partition. If it is determined during matching that many more hits than this will be generated, the matching will fall back to take the best (highest or lowest) values of the attribute given by ranking.matchPhase.attribute.
By default, this will be turned on only when sorting is used and grouping is not. If sorting is used, the primary sort attribute will be used as the match phase attribute if it has fast-search set. In that case the default can be overridden by setting this value explicitly.
ranking.matchPhase.attribute
Alias | |
Values | An attribute name |
Default | none |
The attribute to decide which documents are a match if the match phase estimates that there will be more than maxHits matches. This attribute should have fast-search set and should correlate with the order which would be produced by a full evaluation.
ranking.matchPhase.ascending
Alias | |
Values | boolean |
Default | false |
Whether the attribute should be sorted in ascending or descending (default) order to determine which documents to keep as matches.
ranking.matchPhase.diversity.attribute
Alias | |
Values | An attribute name |
Default | none. |
The attribute to be used for producing the desired diversity.
ranking.matchPhase.diversity.minGroups
Alias | |
Values | long |
Default | none |
The minimum number of groups that should be returned from the match phase grouped by the diversity attribute. Also see min-groups.
Dispatch
dispatch.topKProbability
Alias | |
Values | double |
Default | none |
Probability to use when computing how many hits to fetch from each partition when merging and creating the final result set. See services for details.
Presentation
presentation.bolding [bolding]
Alias | bolding |
Values | boolean |
Default | true |
Whether or not to bold query terms in schema fields defined with bolding: on or summary: dynamic.
presentation.format [format]
Alias | format |
Values | |
Default | default |
presentation.summary [summary]
Alias | summary |
Values | The name of the summary class used to select fields in results. |
Default | The default summary class of the schema. |
presentation.template
Alias | |
Values | Any id specification of a deployed page template. |
Default |
The id of the page template to use for this result. This should be used with the page result format.
presentation.timing
Alias | |
Values | boolean |
Default | false |
Whether a result renderer should try to add optional timing information to the rendered page.
Grouping and Aggregation
select
Alias | |
Values | A valid grouping specification. |
Default | No grouping |
Requests specific multi-level result set statistics and/or hit groups to be returned in the result. Fields you want to retrieve statistics or hit groups for must be stored as document attributes in the index structure by adding attribute to the indexing statement. See the grouping guide.
collapsefield
Alias | |
Values | Any document summary field name |
Default | No field collapsing |
Collapse (i.e. aggregate) results using this field. Collapsing is run in the container, not at the content node level. Define a collapsefield to remove duplicates if the corpus has few duplicates, as this is more efficient than using grouping. Otherwise, use grouping.
collapsesize
Alias | |
Values | A positive integer |
Default | 1 |
The number of hits to keep in each collapsed bucket.
collapse.summary
Alias | |
Values | A valid name of a document summary class. |
Default | Use default summary or attributes. |
Use this summary class to fetch the field used for collapsing.
Geographical Searches
When a position is used in a query, a distance to this position is calculated and returned - see the rendering reference and geo search for details.
Adding a geoLocation item to the query tree is the preferred and uniform way to inject the position, radius, and attribute field, so the request parameters below are now all deprecated.
pos.ll (deprecated)
Alias | pos.ll |
Values | Position given in latitude and longitude. Example: S22.4532;W123.9887 (refer to position field for the format specification). |
Default | None |
pos.radius (deprecated)
Alias | pos.radius |
Values | Radius of the circle used for filtering. Valid units of measurement are km, m and mi. |
Default | 50km |
pos.bb (deprecated)
Alias | pos.bb |
Values | NOTE: To be removed in Vespa 8. Bounding box for positions, given as latitude and longitude boundaries. The four boundaries must be specified as N, S, E, W, with degrees as a decimal fraction. Degrees south of the equator or west of Greenwich are negative numbers. N/S/E/W can be in any order, and can be specified in lower or upper case. Restrictions: N>=S and E>=W. |
Default | None |
pos.attribute (deprecated)
Alias | pos.attribute |
Values | Any attribute that has zcurve encoded positions as a long attribute. |
Default | Random choice among the ones declared as position in the schema. |
Which attribute to use for the position. Can be either single-valued or an array.
Streaming Search
The features in this section apply to streaming search only.
streaming.userid
Alias | |
Values | An integer in decimal notation in the range [0, 2^64> |
Default | None |
Restricts streaming search to only stream through documents whose document ids have the n=<number> modifier and whose userid part matches the supplied value. This can be used for grouping documents on a 64-bit integer.
streaming.groupname
Alias | |
Values | A string |
Default | None |
Restricts streaming search to only stream through documents whose document ids have the g=<groupname> modifier and whose groupname part matches the supplied value. This can be used for grouping documents on a string.
streaming.selection
Alias | |
Values | A string |
Default | None |
Restricts streaming search using a document selection. This can be used for selecting a subset of documents based on an advanced expression.
streaming.priority
Alias | |
Values | Priority class |
Default | VERY_HIGH |
Priority of the streaming search visitor. Having a high priority visitor helps maintain low latencies, even when the system is under load.
streaming.maxbucketspervisitor
Alias | |
Values | int |
Default | 1 (if ordering is set), or infinite |
If set, visit only this many buckets at a time. Combine with ordering to reduce visiting time for large users/groups.
Semantic Rules
Refer to semantic rules.
rules.off
Alias | |
Values | Boolean |
Default | True |
Turn rule evaluation off for this query.
rules.rulebase
Alias | |
Values | String |
Default | A rule base name |
The name of the rule base to use for these queries.
tracelevel.rules
Alias | |
Values | int |
Default | 1-5 (?) |
The amount of rule evaluation trace output to show, higher number means more details. This is useful to see a trace from rule evaluation without having to see trace from all other searchers at the same time.
Other
recall
Alias | |
Values | Any allowed collection of recall terms |
Default | No recall |
Sets a recall parameter to be combined with the query. This is identical to filter, except that recall terms are not exposed to the ranking framework and thus not ranked. As such, one cannot use unprefixed terms; they must be either positive or negative.
user
Alias | |
Values | A string |
Default | None |
The id of the user making the query. The contents of the argument are made available to the search chain, but it triggers no features in Vespa apart from being propagated to the access log.
hitcountestimate
Alias | |
Values | Boolean |
Default | False |
Make this an estimation query. No hits will be returned, and total hit count will be set to an estimate of what executing the query as a normal query would give.
metrics.ignore
Alias | |
Values | Boolean |
Default | False |
Ignore metric collection for this query request. Useful for warm-up queries.