Querying Vespa

Search requests are sent to the Vespa search API, and queries are written in YQL. The following diagram illustrates the components and data flow for Vespa queries.

Query Flow
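
A request to the search API can be sketched as a plain HTTP GET with the YQL query passed as a URL parameter. The snippet below only builds the URL and does not send it. The host and port (localhost:8080), the document type music, and the field album are hypothetical; the /search/ path and the yql, hits and timeout parameters are assumed to follow the search API.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: a container node listening on localhost:8080.
ENDPOINT = "http://localhost:8080/search/"


def build_search_url(yql: str, hits: int = 10, timeout: str = "500ms") -> str:
    """Return a GET URL for the search API with the YQL query URL-encoded."""
    params = {"yql": yql, "hits": hits, "timeout": timeout}
    return f"{ENDPOINT}?{urlencode(params)}"


# Hypothetical schema: document type "music" with an indexed "album" field.
url = build_search_url('select * from music where album contains "head"')
print(url)
```

The same parameters could instead be sent as a JSON body in a POST request, which avoids URL length limits for long queries.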

A query is executed with two main phases:

  1. The search phase, which does matching, ranking and grouping/aggregation. This phase involves dispatching the query out to the content nodes.
  2. The result processing phase, which involves content fetching and snippeting of the top global hits found in the preceding search phase.

The above is a simplification: if the search request also specifies result grouping, the search phase might involve multiple phases or round trips between the container and the content nodes.

Life of a Query

The search phase

  1. A query is sent from a front-end application to a container node over HTTP(S), using GET or POST against the search API. Alternatively, the query can use any custom request format handled by a custom request handler, which translates that format to native Vespa APIs.
  2. Query pre-processing, like linguistic processing and query rewriting, is done in the configured search chains.
  3. The query is sent from the container to the selected content clusters; see federation for more details. The illustration above contains only one content cluster, but multiple clusters are fully supported and allow document types to be scaled differently. For example, a tweet document type could be indexed in a separate content cluster from a user document type, which allows the two to scale independently.
  4. Content clusters have top-level dispatchers (TLDs) which dispatch the query to content groups within the content cluster.
  5. At this point the query enters one or more content clusters. In a content cluster with grouped data distribution, the top-level dispatcher dispatches the query to one group at a time using a dispatch policy, while in a flat, single-group content cluster it dispatches the query to all content nodes.
  6. The query arrives at the content nodes, which perform matching, ranking and aggregation/grouping over the set of indexed documents in the Ready sub database: the vespa-proton process performs matching over the ready and indexed documents and performs ranking as specified with the request/document definition. Each content node matches and ranks a subset of the total document corpus and returns its top k hits upward to the top-level dispatcher, along with meta information such as total hits, and sorting and grouping data if requested.
  7. Once the content nodes within the group have replied within the timeout, the top-level dispatcher merges the results and returns the merged result up to the calling search container for search phase result processing. Only the targeted number of hits is returned. In this phase the only per-hit data available is the Vespa internal global document id (gid) and the relevancy score. In addition there is result meta information like coverage and total hit count. Additional hit-specific data, like the contents of fields, is not available until the result processing phase has completed the content fetching.
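
The merge in step 7 can be illustrated with a minimal sketch. It assumes each content node returns its local top k hits as (gid, relevance) pairs plus a total hit count; the Hit and NodeResult types and the merge_results function are hypothetical names for illustration, not Vespa's internal API.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    gid: str          # global document id, the only per-hit identity in this phase
    relevance: float  # relevancy score computed on the content node


@dataclass
class NodeResult:
    hits: list        # this node's local top-k hits
    total_hits: int   # number of documents this node matched


def merge_results(node_results, k):
    """Merge per-node top-k lists into one global top-k list, best first."""
    all_hits = (hit for result in node_results for hit in result.hits)
    top = heapq.nlargest(k, all_hits, key=lambda h: h.relevance)
    total = sum(result.total_hits for result in node_results)
    return top, total


node_a = NodeResult([Hit("a1", 0.9), Hit("a2", 0.4)], total_hits=120)
node_b = NodeResult([Hit("b1", 0.7), Hit("b2", 0.6)], total_hits=80)
top, total = merge_results([node_a, node_b], k=3)
print([hit.gid for hit in top], total)  # prints ['a1', 'b1', 'b2'] 200
```

Because every node returns its own top k, the global top k is guaranteed to be among the candidates, so no node needs to send more than k hits.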

The result processing (fill) phase

  • When the result from the search phase is available, a custom chained searcher component can process the limited data available from the search phase before the contents of the hits are fetched from the content nodes. Fetching from the content nodes is lazy and is not invoked before rendering the response unless a custom searcher component asks for it earlier.
  • The container asks the content nodes which produced the best hits in the search phase for the contents of the fields; only fields in the requested document summaries are fetched. The summary request goes directly to the content nodes that produced the result from the search phase.
  • After this phase has completed against the content nodes, the complete result set might be processed further by custom components (e.g. result deduping or top-k re-ranking) before the response is finally rendered.
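
The fill phase and a post-fill processing step can be sketched as follows. This is an illustration only: STORE is a hypothetical in-memory stand-in for the content nodes, and fetch_summary, fill and dedupe_by are hypothetical helpers, not Vespa searcher APIs.

```python
# Hypothetical in-memory stand-in for the documents held by the content nodes.
STORE = {
    "g1": {"title": "Blue", "artist": "A", "lyrics": "la la"},
    "g2": {"title": "Blue", "artist": "B", "lyrics": "na na"},
    "g3": {"title": "Red", "artist": "C", "lyrics": "da da"},
}


def fetch_summary(gid, fields):
    """Return only the requested summary fields for one document."""
    doc = STORE[gid]
    return {name: doc[name] for name in fields}


def fill(hits, fields):
    """Fill phase: attach field contents to hits that are not yet filled."""
    for hit in hits:
        if "fields" not in hit:
            hit["fields"] = fetch_summary(hit["gid"], fields)
    return hits


def dedupe_by(hits, field):
    """Keep the first (best-ranked) hit for each distinct field value."""
    seen = set()
    unique = []
    for hit in hits:
        value = hit["fields"][field]
        if value not in seen:
            seen.add(value)
            unique.append(hit)
    return unique


# After the search phase, hits carry only the gid and the relevancy score.
hits = [
    {"gid": "g1", "relevance": 0.9},
    {"gid": "g2", "relevance": 0.8},
    {"gid": "g3", "relevance": 0.5},
]
unique = dedupe_by(fill(hits, ["title"]), "title")
print([hit["gid"] for hit in unique])  # prints ['g1', 'g3']
```

Deduping on a field value is only possible after the fill, since the search phase result does not contain field contents.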