Document Summaries

Use document summaries to configure which fields to include in results. The default summary contains all fields that can be included in summaries; every other summary contains a subset of those fields.

The default summary class always accesses the document store because it includes the document ID, which is stored there. To include the document ID in a custom summary class, add a field for the ID and include it in the summary class.

Use dynamic to generate dynamic abstracts of fields, based on search keywords.

Defining summary sets in the schema

Define additional summary sets as described in the schema reference.

Example: the title and year fields are included in the titleyear summary.

schema music {

    document music {
        field title type string {
            indexing: summary | index
        }
        field artist type string {
            indexing: summary | attribute | index
        }
        field year type int {
            indexing: summary | attribute
        }
        field popularity type int {
            indexing: summary | attribute
        }
        field url type uri {
            indexing: summary | index
        }
    }

    document-summary titleyear {
        summary title type string {
            source: title
        }
        summary year type int {
            source: year
        }
    }
}

For more details on summary properties, see the schema reference.

Using summaries in queries

Use presentation.summary=[summary name] in queries to select a summary class (the default class is called default). See Query API. Example:

/search/?yql=select+*+from+sources+*+where+default+contains+"best"&presentation.summary=titleyear
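The same request can be built programmatically. The following is a minimal Python sketch, not part of Vespa: the helper name build_search_url and the endpoint http://localhost:8080 are assumptions chosen for illustration. It percent-encodes the query, so the URL looks different from the hand-written one above but carries the same parameters.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_url(base_url, yql, summary=None):
    """Build a /search/ URL; 'summary' selects the summary class (hypothetical helper)."""
    params = {"yql": yql}
    if summary is not None:
        # presentation.summary is the query parameter described above
        params["presentation.summary"] = summary
    return f"{base_url}/search/?{urlencode(params)}"


# Assumed local endpoint; replace with the address of your own container.
url = build_search_url(
    "http://localhost:8080",
    'select * from sources * where default contains "best"',
    summary="titleyear",
)

# Decode the query string again to inspect what will be sent.
decoded = parse_qs(urlsplit(url).query)
print(decoded["presentation.summary"])
```

Omitting the summary argument leaves presentation.summary out of the URL, in which case the default class is used.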

The select statement in YQL lists a set of fields to return. Vespa generally makes a best effort to return those fields, and only those fields, unless a wildcard ("*") is given as the argument. The wildcard means that the full set of fields in the given summary class is returned.

In conjunction with YQL statements, the summary argument defines the set of fields from which the YQL select then chooses a subset.

In other words, if the YQL expression is "select * …", and the summary argument is titleyear, all the fields in the summary class titleyear will be returned. If the select statement lists one or more fields (and summary is titleyear), the summary class titleyear is fetched, and the fields not listed in the select statement will be stripped away.
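The interaction between the summary class and the select list can be modelled in a few lines. This is an illustrative Python model of the behaviour described above, not Vespa code: the function name returned_fields is hypothetical, and the handling of a selected field that is missing from the summary class (it is simply not returned) is an assumption drawn from the description, not a documented guarantee.

```python
# Summary classes from the schema example: titleyear holds title and year.
SUMMARY_CLASSES = {
    "titleyear": ["title", "year"],
}


def returned_fields(summary_class, select_fields):
    """Model which fields come back for a summary class and a YQL select list."""
    fetched = SUMMARY_CLASSES[summary_class]
    if select_fields == ["*"]:
        # Wildcard: every field in the summary class is returned.
        return list(fetched)
    # Otherwise the class is fetched and unlisted fields are stripped away.
    return [name for name in fetched if name in select_fields]


print(returned_fields("titleyear", ["*"]))      # ['title', 'year']
print(returned_fields("titleyear", ["title"]))  # ['title']
```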

Performance

When using additional summary classes to increase performance, only the amount of data sent over the network changes; the data read from storage is unchanged. "Debug" fields with summary enabled therefore also add to the amount of data that must be read from disk.

Vespa keeps attribute type fields in memory and fetches those fields from memory when requested as part of document summaries. This means summaries are memory-only operations if all fields are attributes. The other document fields are stored as blobs/records in the document store. This record is used when processing summary requests that include fields in this record, and as needed during visiting or re-distribution of content to handle elasticity.
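To check whether a summary class can be served from memory alone, compare its fields with the indexing settings in the schema. The sketch below is a simplified Python model based on the music schema above; the function name needs_document_store and the yearpopularity class are hypothetical, and the model only applies the rule stated in this section (a class is memory-only when all of its fields are attributes).

```python
# Indexing settings copied from the music schema example.
FIELD_INDEXING = {
    "title": {"summary", "index"},
    "artist": {"summary", "attribute", "index"},
    "year": {"summary", "attribute"},
    "popularity": {"summary", "attribute"},
    "url": {"summary", "index"},
}


def needs_document_store(summary_fields):
    """Return True if any field in the summary class is not an attribute."""
    return any("attribute" not in FIELD_INDEXING[name] for name in summary_fields)


# titleyear includes title, which is not an attribute.
print(needs_document_store(["title", "year"]))       # True
# A hypothetical class holding only attribute fields.
print(needs_document_store(["year", "popularity"]))  # False
```

Under this model, the titleyear class still reads the document store because of the title field, while a class limited to year and popularity would not.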

See query execution - breakdown of the summary (a.k.a. result processing, rendering) phase:

  • The document summary latency on the content node, tracked by content_proton_search_protocol_docsum_latency_average.
  • Getting data across from content nodes to containers.
  • Deserialization from internal binary formats (potentially) to Java objects if touched in a Searcher, and finally serialization to JSON (default rendering) + rendering and network.

The work, and thus the latency, increases with the number of hits. Use query tracing to analyze performance.
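One way to observe this is to run the same query with different hit counts and tracing enabled, then compare the traces. The Python sketch below only builds the request URLs. It assumes the Query API parameters hits and trace.level, a container at http://localhost:8080, and an arbitrary trace level of 3; the helper name build_traced_urls is hypothetical.

```python
from urllib.parse import urlencode


def build_traced_urls(base_url, yql, summary, hit_counts, trace_level=3):
    """Build one traced query URL per requested hit count (hypothetical helper)."""
    urls = []
    for hits in hit_counts:
        params = {
            "yql": yql,
            "presentation.summary": summary,
            "hits": hits,                # number of hits to return
            "trace.level": trace_level,  # assumed value; higher levels give more detail
        }
        urls.append(f"{base_url}/search/?{urlencode(params)}")
    return urls


urls = build_traced_urls(
    "http://localhost:8080",  # assumed local endpoint
    'select * from sources * where default contains "best"',
    "titleyear",
    hit_counts=[10, 100, 400],
)
for url in urls:
    print(url)
```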