Vespa Search API reference

All the search request parameters listed below can be set in query profiles. The first four blocks of properties are also modeled as query profile types. These types can be referenced from query profiles (and inheriting types) to provide type checking on the parameters.

These parameters often have both a full name - which includes the path from the root query profile - and one or more abbreviated names. Both names can be used in search requests, while only full names can be used in query profiles. The full names are case sensitive, while the abbreviated names are case insensitive.
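The naming rule above can be sketched as a small lookup. This is only an illustration, not how Vespa resolves names internally: the `ALIASES` table copies a handful of entries from this reference, and the `resolve` helper is a hypothetical name introduced here.

```python
# Full name -> abbreviated names, copied from a few entries in this reference.
ALIASES = {
    "hits": ["count"],
    "offset": ["start"],
    "model.queryString": ["query"],
    "model.defaultIndex": ["def-idx", "default-index"],
    "ranking.profile": ["ranking"],
}

# Reverse lookup keyed on lower-cased abbreviations (abbreviations are case insensitive).
_BY_ALIAS = {
    alias.lower(): full
    for full, names in ALIASES.items()
    for alias in names
}


def resolve(name):
    """Return the full parameter name for a request parameter, or None if unknown."""
    if name in ALIASES:  # full names must match exactly (case sensitive)
        return name
    return _BY_ALIAS.get(name.lower())
```

With this table, `resolve("COUNT")` yields `hits`, while `resolve("Model.QueryString")` yields `None` because full names are matched exactly.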

The parameters modeled as query profile types are also available to Searcher components as Java objects, through get methods on the Query.

Index

Query
Native Execution Parameters
Query Model Parameters
Ranking
Presentation
Grouping
Geographical Searches
Streaming Search
Semantic Rules
Other

Query

yql

Alias
Values String
Default None

The YQL query, which is parsed and executed in the backend. Only simple YQL programs are supported; refer to YQL for details.

Native Execution Parameters

These parameters are defined in the native query profile type.

hits

Alias count
Values A non-negative integer. The sum of offset and hits should be lower than the configured maxOffset value, and will be adjusted to fit. See also the comment at offset.
Default 10

The maximum number of hits to return from the result set. Must be lower than maxHits, which is either set in a query profile, or default 400.

offset

Alias start
Values A non-negative integer.
Default 0

The index of the first hit to return from the result set. Must be lower than maxOffset, which is either set in a query profile, or default 1000.
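As a rough sketch of how hits and offset combine for paging, the helper below turns a zero-based page number into request parameters. The function name `page_params` is hypothetical, and the `max_offset` argument simply mirrors the default of 1000 stated above; a real deployment may configure a different limit.

```python
from urllib.parse import urlencode


def page_params(page, hits=10, max_offset=1000):
    """Return hits/offset parameters for a zero-based page number."""
    offset = page * hits
    if offset >= max_offset:
        # Per the text above, offset must be lower than maxOffset.
        raise ValueError("offset %d is not lower than maxOffset %d" % (offset, max_offset))
    return {"hits": hits, "offset": offset}


print(urlencode(page_params(2)))  # hits=10&offset=20
```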

queryProfile

Alias None
Values A query profile id - name:version, where version can be omitted or partially specified, e.g. "myprofile:2.1"
Default default

A query profile has default properties for a query. The default query profile is named default - example:

<query-profile id="default">
  <field name="maxHits">10</field>
  <field name="maxOffset">1000</field>
</query-profile>

nocache

Alias
Values True or false
Default false

Set to true to avoid the result being fetched from cache, and avoid writing the result to cache after fetching it.

groupingSessionCache

Alias
Values True or false
Default false

Set to true to store intermediate grouping results in the search backends when using multi-level grouping expressions. This speeds up grouping at a potential loss of accuracy. See the grouping reference for more details.

searchChain

Alias
Values A search chain id - name:version, where version can be omitted or partially specified, e.g. "mychain:2.1.3".
Default default

The search chain initially invoked when processing this query. This search chain may invoke other chains.
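The name:version form used by searchChain and queryProfile can be illustrated with a small parser. This is a sketch under the assumption that the version consists only of dot-separated integers; `parse_component_id` is a hypothetical helper, not part of any Vespa API.

```python
def parse_component_id(spec):
    """Split 'name:version' into (name, version_tuple).

    The version may be omitted or partially specified, so the tuple
    holds zero or more integer components.
    """
    name, _, version = spec.partition(":")
    parts = tuple(int(p) for p in version.split(".")) if version else ()
    return name, parts
```

For example, `"mychain:2.1.3"` yields `("mychain", (2, 1, 3))` and a bare `"default"` yields `("default", ())`.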

timeout

Alias
Values Positive floating point number with an optional unit. Default unit is seconds (s), valid unit strings are e.g. ms and s. To set a timeout of one minute, the argument could be set to 60 s. Space between the number and the unit is optional.
Default Undefined, but guaranteed to be at least 5000 milliseconds. This default can be overridden by configuring timeout in a query profile.

The query timeout.
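To make the value format concrete, here is a minimal sketch that converts a timeout argument to seconds. It handles only the two units named above (ms and s) and plain decimal numbers; the helper name `timeout_seconds` is an assumption for illustration, and Vespa's own parser may accept more forms.

```python
import re

# Number, optional whitespace, optional unit (only ms and s are handled here).
_TIMEOUT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s)?\s*$")


def timeout_seconds(value):
    """Convert a timeout string such as '60 s' or '500ms' to seconds."""
    match = _TIMEOUT.match(value)
    if not match:
        raise ValueError("not a valid timeout: %r" % value)
    number = float(match.group(1))
    if number <= 0:
        raise ValueError("timeout must be positive: %r" % value)
    # The default unit is seconds when no unit is given.
    return number / 1000.0 if match.group(2) == "ms" else number
```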

tracelevel

Alias
Values Any positive number
Default No tracing

Set to a positive number to collect trace information for debugging when running a query. Higher numbers give progressively more detail on query transformations and searcher execution.

trace.timestamps

Alias
Values true or false
Default No timestamps in trace

Set to true to include timing information in the trace already at tracelevel=1.

Query Model Parameters

model.defaultIndex [def-idx, default-index]

Alias def-idx, default-index
Values An index name
Default default

The field searched for query terms that do not explicitly specify an index.

model.encoding [encoding]

Alias encoding
Values Encoding names or aliases defined in the IANA character sets
Default utf-8

Sets the encoding to use when returning a result. The encodings big5, euc-jp, euc-kr, gb2312, iso-2022-jp and shift-jis also influence how tokenization is done in the absence of an explicit language setting.

The query is always encoded as UTF-8, independently of how the result will be encoded.

model.filter [filter]

Alias filter
Values Any allowed collection of filter terms
Default Not set

Sets a filter to be combined with the query. Typical use of a filter is to add machine generated or preferences based filter terms to a raw user query. The filter is parsed the same way as a query of type any, the full syntax is available. The positive terms (preceded by +) and phrases act as AND filters, the negative terms (preceded by -) act as NOT filters, while the unprefixed terms will be used to RANK the results. Unless the query has no positive terms, the filter will only restrict and influence ranking of the result set, never cause more matches than the query.
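The roles of filter terms described above can be sketched with a toy classifier. It is a deliberate simplification: it recognizes only whitespace-separated terms and double-quoted phrases, whereas the real filter is parsed with the full syntax of a query of type any. The function name `classify_filter` is hypothetical.

```python
import re

# A token is an optionally prefixed quoted phrase, or a run of non-space characters.
_TOKEN = re.compile(r'[+-]?"[^"]*"|\S+')


def classify_filter(filter_string):
    """Group filter terms by the role they play: AND, NOT or RANK."""
    roles = {"AND": [], "NOT": [], "RANK": []}
    for token in _TOKEN.findall(filter_string):
        if token.startswith("-"):
            roles["NOT"].append(token[1:].strip('"'))   # negative term
        elif token.startswith("+"):
            roles["AND"].append(token[1:].strip('"'))   # positive term
        elif token.startswith('"'):
            roles["AND"].append(token.strip('"'))       # phrases act as AND filters
        else:
            roles["RANK"].append(token)                 # unprefixed term
    return roles
```

For instance, `+lang:en -spam "red shoes" sale` places `lang:en` and `red shoes` under AND, `spam` under NOT and `sale` under RANK.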

model.language [lang, language]

Alias language, lang
Values Ref. RFC 3066
Default Unspecified

Informs Vespa about the natural language of the query. See linguistics for details. This parameter should always be set when the language is known. If it is not set, the language is guessed from the query and encoding, defaulting to English if it cannot be guessed.

model.queryString [query]

Alias query
Values Any HTTP encoded legal Vespa query language string
Default Not set

The Simple Vespa Query Language query string specifying which documents to match in this query.

model.restrict [restrict]

Alias restrict
Values A comma delimited list of document type names.
Default Search unrestricted

The document types to restrict the search to when different document types share the same search cluster.

model.searchPath [searchpath]

Alias searchpath
Values
  • searchpath::ELEMENT [';' ELEMENT]*
  • ELEMENT::PART ['/' ROW]
  • PART::EXP [',' EXP]*
  • EXP::NUM | RANGE
  • ROW::NUM
  • RANGE::'[' NUM ',' NUM '>'
Default Whole cluster

Specification of which path to send the query to. Used to select which set of search nodes in the cluster should be used. Only meant for debugging/monitoring.

Examples (note that an indexed content cluster with flat distribution has one implicit row, and each search node represents a part):

  • '7/3' = part 7, row 3.
  • '7/' = part 7, any row.
  • '7,1,9/0' = parts 1, 7 and 9, row 0.
  • '1,[3,9>/0' = parts 1, 3, 4, 5, 6, 7 and 8, row 0 (the range [3,9> excludes 9).

In a cluster with a multi-level dispatch setup, a search path element must be specified for each level. Let's say we have a setup with 2 mid-level dispatch groups, each containing 3 search nodes (and 3 dispatchers):

  • '0/;2/' = dispatch group (part) 0, any of the dispatchers (row); search node (part) 2, any row (of 1 present).
  • '0/1;2/0' = dispatch group (part) 0, dispatcher (row) 1; search node (part) 2, row 0 (of 1 present).
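The search path grammar and examples above can be expanded mechanically. The following is a minimal sketch of such an expansion, written for illustration only: `parse_search_path` is a hypothetical helper, it does no validation of malformed input, and it represents "any row" as `None`.

```python
import re

# An EXP is either a half-open RANGE '[lo,hi>' or a single NUM.
_EXP = re.compile(r"\[(\d+),(\d+)>|(\d+)")


def parse_search_path(path):
    """Return a list of (parts, row) tuples, one per ';'-separated element.

    parts is a sorted list of part numbers; row is an int, or None for any row.
    """
    elements = []
    for element in path.split(";"):
        part_spec, _, row_spec = element.partition("/")
        parts = set()
        for lo, hi, single in _EXP.findall(part_spec):
            if single:
                parts.add(int(single))
            else:
                parts.update(range(int(lo), int(hi)))  # '[3,9>' excludes 9
        elements.append((sorted(parts), int(row_spec) if row_spec else None))
    return elements


print(parse_search_path("1,[3,9>/0"))  # [([1, 3, 4, 5, 6, 7, 8], 0)]
```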

model.sources [search, sources]

Alias search, sources
Values A comma separated list of search cluster names or other source names
Default Search unrestricted

The names of the sources to search, e.g. one or more search clusters and/or federated sources.

model.type [type]

Alias type
Values web, all, any, phrase, yql, adv (deprecated) - refer to simple query language reference
Default all

Selects the query language syntax of the query parameter.

Ranking

ranking.location [location]

Alias location
Values See Geo search
Default None

Point (one or two dimensional) location to use as base for location ranking. For geographical locations, it is recommended to add the location using pos.ll.

ranking.features.featurename [rankfeature.featurename]

Alias rankfeature.featurename
Values Any string
Default None

Set a rank feature to a value. This works for any key name query(anyname) (query features), and also as a way to override all existing (match and document) features. Example: query=foo&ranking.features.query(userage)=42&ranking.features.fieldMatch(title)=0.65
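Because rank feature names contain parentheses, it can help to let a library do the URL encoding. The sketch below builds the example request with Python's standard `urllib.parse.urlencode`; the base URL and the `search_url` helper are assumptions made for illustration and should be replaced with the actual endpoint of the deployment.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "http://localhost:8080/search/"


def search_url(params, base=BASE_URL):
    """Return a search URL with all parameter names and values percent-encoded."""
    return base + "?" + urlencode(params)


url = search_url({
    "query": "foo",
    "ranking.features.query(userage)": 42,
    "ranking.features.fieldMatch(title)": 0.65,
})
print(url)
```

The parentheses in the parameter names are emitted as %28 and %29 in the resulting URL.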

ranking.listFeatures [rankfeatures]

Alias rankfeatures
Values boolean
Default false

Set to true to request all rank features to be calculated and returned. The rank features will be returned in the summary field rankfeatures. This option is typically used for MLR training and should not be used in production.

ranking.profile [ranking]

Alias ranking
Values Any rank profile name
Default default

Sets the name of the rank profile to use for assigning relevancy scores. The default rank profile will be used for backends that do not have the given rank profile.

ranking.properties.propertyname [rankproperty.propertyname]

Alias rankproperty.propertyname
Values Any string
Default None

Set a rank property that is passed to, and used by a feature executor for this query. Example: query=foo&ranking.properties.dotProduct.X={a:1,b:2}

ranking.sorting [sorting]

Alias sorting
Values A valid sort specification
Default None - order by relevance

A specification of how to sort the result. Fields you want to sort on must be stored as document attributes in the index structure by adding attribute to the indexing statement.

ranking.freshness

Alias
Values [integer], an absolute time in seconds since epoch; now-[integer], a time [integer] seconds into the past; or now, the current time
Default None - use the current time on each node.

Sets the time which will be used as now during execution.
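The three accepted value forms can be sketched as a resolver that returns an absolute time in seconds since epoch. This only illustrates the format described above and is not Vespa's implementation; `resolve_freshness` and its `current` argument are hypothetical names.

```python
import time


def resolve_freshness(value, current=None):
    """Resolve a ranking.freshness value to seconds since epoch.

    current is the reference time for 'now'; it defaults to the system clock.
    """
    current = int(time.time()) if current is None else current
    if value == "now":
        return current
    if value.startswith("now-"):
        # 'now-N' means N seconds into the past.
        return current - int(value[len("now-"):])
    return int(value)  # an absolute time
```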

ranking.queryCache

Alias
Values boolean
Default false

Turns query cache on or off. Search is a two-phase process. If the query cache is on, the query is stored on the search nodes between the first and second phase, saving network bandwidth and also query setup time, at the expense of using more memory.

ranking.matchPhase

Settings which control Vespa's behavior during the match phase. If these are set in the query they will override any match-phase setting in the rank profile.

ranking.matchPhase.maxHits

Alias
Values long
Default If sorting and not ranking: max(10000, maxHits+maxOffset). Otherwise: none.

The max hits the engine should attempt to produce in the match phase on each partition. If it is determined during matching that many more hits than this will be generated, the matching will fall back to take the best (highest or lowest) values of the attribute given by ranking.matchPhase.attribute.

By default, this will be turned on only when sorting is used and grouping is not. If sorting is used, the primary sort attribute will be used as the match phase attribute if it has fast-search set. In that case the default can be overridden by setting this value explicitly.

ranking.matchPhase.attribute

Alias
Values An attribute name
Default none

The attribute to decide which documents are a match if the match phase estimates that there will be more than maxHits matches. This attribute should have fast-search set and should correlate with the order which would be produced by a full evaluation.

ranking.matchPhase.ascending

Alias
Values boolean
Default false

Whether the attribute should be sorted in ascending or descending (default) order to determine which documents to keep as matches.

ranking.matchPhase.diversity.attribute

Alias
Values An attribute name
Default none

The attribute to be used for producing the desired diversity. Also see attribute.

ranking.matchPhase.diversity.minGroups

Alias
Values long
Default none

The minimum number of groups that should be returned from the match phase grouped by the diversity attribute. Also see min-groups.

Presentation

presentation.bolding [bolding]

Alias bolding
Values boolean
Default true

Whether or not to bold search terms in search definition fields defined with bolding: on or summary: dynamic.

presentation.format [format]

Alias format
Values
  • No value or default: the default, built-in JSON format
  • json: built-in JSON format
  • xml: deprecated, built-in XML format
  • page: alternative deprecated XML format, suitable for use with page templates
  • Any other value: a custom result renderer supplied by the application
Default default

presentation.summary [summary]

Alias summary
Values The name of the summary class used to select fields in results.
Default The default summary class of the search definition.

presentation.template

Alias
Values Any id specification of a deployed page template.
Default

The id of the page template to use for this result. This should be used with the page result format.

presentation.timing

Alias
Values boolean
Default false

Whether a result renderer should try to add optional timing information to the rendered page.

Grouping and Aggregation

select

Alias
Values A valid grouping specification.
Default No grouping

Requests specific multi-level result set statistics and/or hit groups to be returned in the result. Fields you want to retrieve statistics or hit groups for must be stored as document attributes in the index structure by adding attribute to the indexing statement. See the grouping guide.

collapsefield

Alias
Values Any document summary field name
Default No field collapsing

Collapse (i.e. aggregate) results using this field. Collapsing runs in the container, not at the content node level. Define a collapsefield to remove duplicates if the corpus has few duplicates, as this is more efficient than grouping. Otherwise, use grouping.

collapsesize

Alias
Values A positive integer
Default 1

The number of hits to keep in each collapsed bucket.

collapse.summary

Alias
Values A valid name of a document summary class.
Default Use default summary or attributes.

Use this summary class to fetch the field used for collapsing.

Geographical Searches

pos.ll

Alias
Values Position given in latitude and longitude, for example S22.4532;W123.9887. Refer to position field for the format specification.
Default None

pos.radius

Alias
Values Radius of the circle used for filtering. Valid units of measurement are km, m and mi. Examples:
  • pos.radius=100m
  • pos.radius=42mi
  • pos.radius=4km
One can also specify just a number (internal units, micro-degrees), but this is not recommended.
Default 50km
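As an illustration of the radius units, the sketch below converts a pos.radius value to meters. It accepts only the three named units and rejects bare numbers, since those are in internal units (micro-degrees) and are not recommended. The helper name `radius_in_meters` is hypothetical; the mile factor is the international mile, 1609.344 m.

```python
import re

_METERS_PER_UNIT = {"m": 1.0, "km": 1000.0, "mi": 1609.344}
_RADIUS = re.compile(r"^(\d+(?:\.\d+)?)(km|mi|m)$")


def radius_in_meters(value):
    """Convert a pos.radius value such as '4km' or '42mi' to meters."""
    match = _RADIUS.match(value)
    if not match:
        raise ValueError("expected a number followed by km, m or mi: %r" % value)
    return float(match.group(1)) * _METERS_PER_UNIT[match.group(2)]
```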

pos.bb

Alias
Values Bounding box for positions, given as latitude and longitude boundaries. The four boundaries must be specified as N, S, E, W, with degrees as a decimal fraction. Degrees south of equator or west of Greenwich are input as negative numbers. Examples:
  • n=37.44899,s=37.3323,e=-121.98241,w=-122.06566
  • s=40.183868,w=-74.819519,n=40.248291,e=-74.728798
Default None
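The bounding box format can be read into a dictionary with a few lines of code. This is a sketch for illustration, assuming the comma-separated key=value form shown in the examples above; `parse_bounding_box` is a hypothetical helper.

```python
def parse_bounding_box(value):
    """Parse a pos.bb value into a dict with the keys n, s, e and w (degrees)."""
    box = {}
    for item in value.split(","):
        key, _, number = item.partition("=")
        box[key.strip().lower()] = float(number)
    if set(box) != {"n", "s", "e", "w"}:
        raise ValueError("all four boundaries n, s, e, w are required: %r" % value)
    return box


# The boundaries may appear in any order.
print(parse_bounding_box("s=40.183868,w=-74.819519,n=40.248291,e=-74.728798"))
```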

pos.attribute

Alias
Values Any attribute that has zcurve encoded positions as a long attribute.
Default Random choice among the ones declared as position in the search definition.

Which attribute to use for the position. Can be both single- or multi-value.

Streaming Search

The features in this section apply to streaming search only.

streaming.userid

Alias
Values An integer in decimal notation in the range [0, 2^64>
Default None

Restricts streaming search to documents whose document ids have the n=<number> modifier and whose userid part matches the supplied value. This can be used for grouping documents on a 64-bit integer.

streaming.groupname

Alias
Values A string
Default None

Restricts streaming search to documents whose document ids have the g=<groupname> modifier and whose groupname part matches the supplied value. This can be used for grouping documents on a string.

streaming.selection

Alias
Values A string
Default None

Restricts streaming search using a document selection. This can be used for selecting a subset of documents based on an advanced expression.

streaming.priority

Alias
Values Priority class
Default VERY_HIGH

Priority of the streaming search visitor. Having a high priority visitor helps maintain low latencies even when the system is under load.

streaming.maxbucketspervisitor

Alias
Values int
Default 1 (if ordering is set), or infinite

If set, visit only this many buckets at a time. Combine with ordering to reduce visiting time for large users/groups.

Semantic Rules

Refer to semantic rules.

rules.off

Alias
Values Boolean
Default True

Turn rule evaluation off for this query.

rules.rulebase

Alias
Values String
Default A rule base name

The name of the rule base to use for these queries.

tracelevel.rules

Alias
Values int
Default 1-5 (?)

The amount of rule evaluation trace output to show; a higher number means more detail. This is useful for seeing the trace from rule evaluation without having to see the trace from all other searchers at the same time.

Other

recall

Alias
Values Any allowed collection of recall terms
Default No recall

Sets a recall parameter to be combined with the query. This is identical to filter, except that recall terms are not exposed to the ranking framework and thus not ranked. As such, unprefixed terms cannot be used; terms must be either positive or negative.

user

Alias
Values A string
Default None

The id of the user making the query. The contents of the argument are made available to the search chain, but it triggers no features in Vespa apart from being propagated to the access log.

nocachewrite

Alias
Values Boolean
Default False

Set to true to avoid the result being written to cache when fetched.

hitcountestimate

Alias
Values Boolean
Default False

Make this an estimation query. No hits will be returned, and total hit count will be set to an estimate of what executing the query as a normal query would give.