
Query API

Use the Vespa Query API to query, rank and organize data. Example:

  yql=select * from sources * where year > 2018;
Simplified, a query has the following components:
  • Input data
  • Ranking and grouping specification
  • Results
  • Other execution parameters
This guide is an introduction to the more important elements in the API - refer to the Query API reference for details. See query execution below for data flow.


Input data is both structured data and unstructured data (the latter called "user input"). Example (URL-decoded for readability):

yql=select * from sources * where artist contains "Coldplay" and userInput(@inp);&
The first line is the YQL query string, which has both structured input (artist=Coldplay) and a reference to unstructured user input. The user input is then given in the second line, in the inp parameter.

Separating the structured data from the unstructured relieves the application code from interpreting and sanitizing the input data - it is essentially a blob. Vespa can then use heuristics to deduce the user's intention. User input can also be expressed in the simple query language using the userQuery operator.
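A minimal sketch of how a client might assemble such a request while keeping the user's text out of the YQL string. The function name build_query_string and the sample input are illustrative assumptions; yql and inp are the parameters from the example above.

```python
from urllib.parse import parse_qs, urlencode

# Structured part of the query, written by the application.
# The user's text is never spliced into this string; YQL refers to it as @inp.
YQL = 'select * from sources * where artist contains "Coldplay" and userInput(@inp);'


def build_query_string(user_text: str) -> str:
    """Return the URL-encoded parameters for a query request."""
    return urlencode({"yql": YQL, "inp": user_text})


# The raw user text survives the round trip unchanged, whatever it contains.
decoded = parse_qs(build_query_string("head & shoulders"))
print(decoded["inp"][0])
```

Because the user text travels in its own parameter, characters such as quotes or ampersands in it cannot alter the structure of the YQL statement.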

Finally, input data can also be ranking query features - here the query feature is called user_profile. Query features are data, normally valid for this particular instance, that are used in the subsequent document ranking - this enables online decision making based on realtime data.

See query execution below. Use search chains to implement custom query processing. This is the integration point to plug in code to enrich a query - example: Look up user profile data from a user ID in the request. Set &tracelevel=2 to inspect the search chain components.

An application can have multiple content clusters - Vespa searches in all by default. Federation controls how to query the clusters, and sources names the clusters to query.

Query Profiles

Use query profiles to store query parameters in configuration. This makes query strings shorter, and makes it easy to modify queries by modifying configuration only. Use cases are setting query properties for different markets, parameters that do not change, and so on. Query profiles can be nested, versioned and use inheritance.
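As a rough mental model only, the sketch below treats a query profile as a set of default parameters. It assumes that values in the request take precedence over values in the profile, and that a profile's own values take precedence over inherited ones; versioning, nesting and fields that cannot be overridden are ignored. The function name and the sample parameters are hypothetical.

```python
from typing import Optional


def resolve_properties(profile: dict, request: dict,
                       parent: Optional[dict] = None) -> dict:
    """Merge parameters: inherited values, then the profile, then the request."""
    resolved = dict(parent or {})
    resolved.update(profile)   # the profile overrides what it inherits
    resolved.update(request)   # the request overrides the profile (assumption)
    return resolved


base = {"hits": 10, "timeout": "500ms"}
market_no = {"model.language": "no"}
print(resolve_properties(market_no, {"hits": 20}, parent=base))
```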

Geo Filter and Ranking

Filter by position using latitude and longitude to implement geo search. Use a square bounding box, or a position and radius. DistanceToPath is a rank function based on closeness. Ranking by closeness can often give better results than geo filtering.
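The two filter shapes can be illustrated with a small conceptual sketch. This is not how Vespa implements geo search; it only shows what "inside a bounding box" and "within a radius of a position" mean. The function names are hypothetical, the Earth radius is an approximation, and the bounding box check ignores boxes that cross the antimeridian.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(point, center, radius_km):
    """True if point (lat, lon) lies within radius_km of center (lat, lon)."""
    return haversine_km(point[0], point[1], center[0], center[1]) <= radius_km


def in_bounding_box(point, south, west, north, east):
    """True if point (lat, lon) lies inside the box; antimeridian not handled."""
    lat, lon = point
    return south <= lat <= north and west <= lon <= east
```

A distance such as the one returned by haversine_km can also serve as a ranking signal rather than a hard filter, which keeps slightly more distant but otherwise better matches in the result.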

Ranking and Grouping

Ranking specifies the computation of the query and data. It assigns scores to documents, and returns documents ordered by score. A rank profile is a specification for how to compute a document's score. An application can have multiple rank profiles, to run different computations. Example: a query specifies rank_albums, with this schema definition:


rank-profile rank_albums inherits default {
    first-phase {
        expression: sum(query(user_profile) * attribute(category_scores))
    }
}
Results can be ordered using sorting instead of ranking.
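To make the first-phase expression of rank_albums concrete, here is a sketch of the arithmetic it describes. It assumes that both query(user_profile) and attribute(category_scores) are sparse tensors over one shared dimension, modelled here as plain dicts from label to value; the labels and numbers are made up for illustration.

```python
def first_phase_score(user_profile: dict, category_scores: dict) -> float:
    """Multiply cells with matching labels, then sum the products."""
    return sum(
        weight * category_scores[label]
        for label, weight in user_profile.items()
        if label in category_scores  # labels present on one side only contribute nothing
    )


user_profile = {"pop": 0.5, "rock": 0.25}   # sent with the query
album_a = {"pop": 1.0, "jazz": 0.5}         # stored with the document
album_b = {"rock": 1.0}

print(first_phase_score(user_profile, album_a))  # 0.5
print(first_phase_score(user_profile, album_b))  # 0.25
```

With this profile, album_a scores higher than album_b and would be returned first.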

Grouping is a way to group documents in the result set after ranking. Example: return at most 3 albums per artist, grouped on artist:

| all(group(artist) each(max(3) each(output(summary())) ) );
Fields used in grouping (here: artist) must be attributes. The grouping expression is part of the YQL query string, appended at the end.

Applications can group all documents (select all documents in YQL). Using hits=0 will return grouping results only.
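The effect of the grouping expression above can be emulated on a list of already ranked hits. This is a sketch of the semantics only, not of Vespa's distributed implementation; the function name and the hit dictionaries are hypothetical.

```python
def group_by_artist(ranked_hits, max_per_group=3):
    """Group hits on the artist field, keeping at most max_per_group per artist.

    ranked_hits must already be ordered best first, so each group keeps
    its highest-ranked hits.
    """
    groups = {}
    for hit in ranked_hits:
        bucket = groups.setdefault(hit["artist"], [])
        if len(bucket) < max_per_group:
            bucket.append(hit)
    return groups


hits = [
    {"artist": "Coldplay", "album": "A"},
    {"artist": "Coldplay", "album": "B"},
    {"artist": "Muse", "album": "C"},
    {"artist": "Coldplay", "album": "D"},
    {"artist": "Coldplay", "album": "E"},
]
for artist, albums in group_by_artist(hits).items():
    print(artist, [h["album"] for h in albums])
```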

The above rank profile does not do text ranking - there are, however, built-in profiles that do. Text search and ranking is described in more detail in Text Matching and Ranking - find information about normalizing, prefix search and linguistics there.


All fields are returned in results by default. To specify a subset of fields, use document summaries. When searching text, having a static abstract of the document in a field, or using a dynamic summary, can both improve the visual relevance of the search and cut the bandwidth used.

The default output format is JSON. Write a custom Renderer to generate results in other formats.

Read more on request-response processing - use this to write code to manipulate results.

Query execution


  1. Query processing: Normalizations, rewriting and enriching. Custom logic in search chains
  2. Matching, ranking and grouping/aggregation: This phase dispatches the query to content nodes
  3. Result processing, rendering: Content fetching and snippeting of the top global hits found in the query phase
The above is a simplification - if the query also specifies result grouping, the query phase might involve multiple phases or roundtrips between the container and content nodes.

Query processing and dispatch

  1. A query is sent from a front-end application to a container node using the Query API or in any custom request format handled by a custom request handler, which translates the custom request format to native Vespa APIs.
  2. Query pre-processing, like linguistic processing and query rewriting, is done in built-in and custom chains - see searcher development. The default search chain is vespa - find installed components in this chain by inspecting ApplicationStatus like in the quick-start. Adding &tracelevel=4 (or higher) to the query will emit the components invoked in the query, and is useful to analyze ordering.
  3. The query is sent from the container to the content cluster - see federation for more details. The illustration above has one content cluster, but multiple clusters are fully supported and allow scaling document types differently. E.g. a tweet document type can be indexed in a separate content cluster from a user document type, enabling independent scaling of the two.

Matching, ranking, grouping

  1. At this point the query enters one or more content clusters. In a content cluster with grouped distribution, the query is dispatched to one group at a time using a dispatch policy, while with a flat single group content cluster the query is dispatched to all content nodes.
  2. The query arrives at the content nodes, which perform matching, ranking and aggregation/grouping over the set of documents in the Ready sub database. vespa-proton does matching over the ready documents and ranks as specified with the request/schema. Each content node matches and ranks a subset of the total document corpus and returns the hits along with meta information like total hits and sorting and grouping data, if requested.

  3. Once the content nodes within the group have replied within the timeout, max-hits / top-k results are returned to the container for query phase result processing. In this phase, the only per hit data available is the internal global document id (gid) and the ranking score. There is also result meta information like coverage and total hit count. Additional hit specific data, like the contents of fields, is not available until the result processing phase has completed the content fetching.
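The merge described in step 3 can be sketched as follows. Each content node is assumed to return its own best hits as (gid, score) pairs, and the container keeps the global top-k. The function name and the data are hypothetical, and timeouts, coverage and total hit counts are left out.

```python
import heapq


def merge_node_results(node_results, k):
    """Merge per-node hit lists of (gid, score) into the global top-k by score."""
    all_hits = (hit for hits in node_results for hit in hits)
    return heapq.nlargest(k, all_hits, key=lambda hit: hit[1])


node_results = [
    [("gid-a", 0.9), ("gid-b", 0.4)],  # best hits from the first content node
    [("gid-c", 0.7), ("gid-d", 0.1)],  # best hits from the second content node
]
print(merge_node_results(node_results, k=2))
```

At this point only the gid and the score are known for each hit, which is why the field contents are fetched in a later phase.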

Result processing (fill) phase

  1. When the result from the query phase is available, a custom chained searcher component can process the limited data available from the first search phase before the contents of the hits are fetched from the content nodes. The fetching from content nodes is lazy and is not invoked before rendering the response, unless asked for earlier by a custom searcher component.
  2. Only fields in the requested document summaries are fetched from content nodes. The summary request goes directly to the content nodes that produced the result from the query phase.
  3. After the content node requests have completed, the full result set can be processed further by custom components (e.g. doing result deduping, top-k re-ranking), before rendering the response.
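The fill steps above can be sketched as follows. Hits start with only a gid, a score and the node they came from; field contents are fetched later, only for the requested summary fields, and only from the node that produced the hit. The Hit class, the fill function and the in-memory node_docs mapping are hypothetical stand-ins for the real components.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hit:
    gid: str
    score: float
    node: int                      # content node that produced this hit
    fields: Optional[dict] = None  # None until the hit has been filled


def fill(hits, node_docs, summary_fields):
    """Fetch the requested summary fields for every hit not yet filled."""
    for hit in hits:
        if hit.fields is not None:
            continue  # already filled, e.g. by an earlier component
        doc = node_docs[hit.node][hit.gid]  # ask only the node that owns the hit
        hit.fields = {name: doc[name] for name in summary_fields if name in doc}
    return hits


node_docs = {0: {"gid-a": {"title": "A", "artist": "X", "lyrics": "long text"}}}
hits = fill([Hit("gid-a", 0.9, node=0)], node_docs, ["title", "artist"])
print(hits[0].fields)
```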


Use GET or POST. Parameters can be sent either as GET parameters or posted as JSON. These are equivalent:

$ curl -H "Content-Type: application/json" \
    --data '{"yql" : "select * from sources * where default contains \"coldplay\";"}' \

$ curl http://localhost:8080/search/?yql=select+%2A+from+sources+%2A+where+default+contains+%22coldplay%22%3B
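The equivalence can be checked without a running Vespa instance by building both request forms from the same parameters. This sketch uses only the Python standard library; the function names are hypothetical, and the endpoint is the one from the GET example above.

```python
import json
from urllib.parse import urlencode

YQL = 'select * from sources * where default contains "coldplay";'


def as_get_url(params, endpoint="http://localhost:8080/search/"):
    """Encode the parameters into the query string of a GET request."""
    return endpoint + "?" + urlencode(params)


def as_post_body(params):
    """Serialize the same parameters as a JSON body for a POST request."""
    return json.dumps(params)


params = {"yql": YQL}
print(as_get_url(params))    # same URL as in the GET example above
print(as_post_body(params))  # send with Content-Type: application/json
```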

Using POST

The general form of POST data is:

{
    param1 : value1,
    param2 : value2,
}
The format is based on the Query API reference, and has been converted from the flat dot notation to a nested JSON-structure.
  • The request-method must be POST and the Content-Type must be "application/json".
  • Also try the GUI for building queries at localhost:8080/querybuilder/ (with Vespa running). It helps build queries, with e.g. autocompletion of YQL, pasting of already built queries and conversion of JSON queries to URL queries.
Example:

{
    "yql" : "select * from sources * where default contains \"coldplay\";",
    "offset"  : 5,
    "ranking" : {
        "matchPhase" : {
            "ascending" : true,
            "maxHits"   : 15
        }
    },
    "presentation" : {
        "bolding" : false,
        "format"  : "json"
    }
}
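The conversion from flat dot notation to a nested structure can be sketched with a small helper. The function name nest is hypothetical. The sketch does not handle a name that is used both as a value and as a prefix of other names.

```python
import json


def nest(flat: dict) -> dict:
    """Turn {"a.b.c": 1} into {"a": {"b": {"c": 1}}}."""
    nested = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})  # create intermediate objects as needed
        node[leaf] = value
    return nested


flat = {
    "yql": 'select * from sources * where default contains "coldplay";',
    "offset": 5,
    "ranking.matchPhase.ascending": True,
    "ranking.matchPhase.maxHits": 15,
    "presentation.bolding": False,
    "presentation.format": "json",
}
print(json.dumps(nest(flat), indent=4))
```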
Note: Security filters can treat GET and POST requests differently, so a filter that lets GET queries through can still block POSTed queries.


The Search Container uses Jetty for HTTP. Configure the http server - e.g. set requestHeaderSize to configure URL length (including headers):

<container version="1.0">
        <server port="8080" id="myserver">
            <config name="jdisc.http.connector">
HTTP keepalive is supported.

Values must be encoded according to standard URL encoding. Thus, space is encoded as +, + as %2b and so on - see RFC 2396.
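Most HTTP client libraries do this encoding automatically. As an illustration, Python's standard library encodes a value as follows (hex digits are emitted in upper case, which is equivalent to the lower-case form):

```python
from urllib.parse import quote_plus, unquote_plus

raw = "rock + roll"
encoded = quote_plus(raw)
print(encoded)                # rock+%2B+roll
print(unquote_plus(encoded))  # rock + roll
```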

HTTP status codes are found in the Query API reference. Also see Stack Overflow question.


If Vespa cannot generate a valid search expression from the query string, it will issue the error message Null query. To troubleshoot, add &tracelevel=2 to the request. A missing yql parameter will also emit this error message.

Query tracing

Use query tracing to debug query execution. Enable by using tracelevel=1 (or higher). Add trace.timestamps=true for timing info for every searcher invoked. Example:


    trace: {
        children: [
                message: "No query profile is used"
                message: "Resolved properties: tracelevel=6 (value from request) yql=select * from sources * where default contains "hi"; (value from request) trace.timestamps=true (value from request) "
                message: "Invoking chain 'vespa' [com.yahoo.prelude.statistics.StatisticsSearcher@native -> com.yahoo.prelude.querytransform.PhrasingSearcher@vespa -> ... -> federation@native]"
            children: [
                    timestamp: 0,
                    message: "Invoke searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'"
                    timestamp: 14,
                    message: "com.yahoo.prelude.statistics.StatisticsSearcher in native Dependencies{provides=[StatisticsSearcher, com.yahoo.prelude.statistics.StatisticsSearcher], before=[rawQuery], after=[]}"
In custom code, use Query.trace to add trace output.
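On the client side, a trace can be read programmatically. The sketch below assumes the response has been parsed from JSON and that the trace is a tree of objects with optional children, message and timestamp keys, as in the example above. The function name trace_messages is hypothetical.

```python
def trace_messages(node, depth=0):
    """Yield (depth, timestamp, message) for every message in a trace tree."""
    if "message" in node:
        yield depth, node.get("timestamp"), node["message"]
    for child in node.get("children", []):
        yield from trace_messages(child, depth + 1)


trace = {
    "children": [
        {"message": "No query profile is used"},
        {"children": [
            {"timestamp": 0, "message": "Invoke searcher 'com.yahoo.prelude.statistics.StatisticsSearcher in native'"},
        ]},
    ]
}
for depth, timestamp, message in trace_messages(trace):
    print("  " * depth, timestamp, message)
```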