Use YQL to select which fields to include in results:
$ vespa query "select artist, album from music where album contains 'head'"
{
    "root": {
        "children": [
            {
                "id": "index:mycontentcluster/0/de97c3f0cf0d1122b3494a44",
                "relevance": 0.16343879032006284,
                "source": "mycontentcluster",
                "fields": {
                    "artist": "Coldplay",
                    "album": "A Head Full of Dreams"
                }
            }
        ]
    }
}
In addition to the schema fields,
one can also select sddocname
and documentid:
$ vespa query "select artist, album, documentid, sddocname from music where album contains 'head'"
{
    "root": {
        "children": [
            {
                "id": "id:mynamespace:music::a-head-full-of-dreams",
                "relevance": 0.16343879032006284,
                "source": "mycontentcluster",
                "fields": {
                    "sddocname": "music",
                    "documentid": "id:mynamespace:music::a-head-full-of-dreams",
                    "artist": "Coldplay",
                    "album": "A Head Full of Dreams"
                }
            }
        ]
    }
}
Use * to select all fields.
As no summary class is given,
the default summary class with all fields is used:
$ vespa query "select * from music where album contains 'head'"
{
    "root": {
        "children": [
            {
                "id": "id:mynamespace:music::a-head-full-of-dreams",
                "relevance": 0.16343879032006284,
                "source": "mycontentcluster",
                "fields": {
                    "sddocname": "music",
                    "documentid": "id:mynamespace:music::a-head-full-of-dreams",
                    "artist": "Coldplay",
                    "album": "A Head Full of Dreams",
                    "year": 2015,
                    "category_scores": {
                        "type": "tensor<float>(cat{})",
                        "cells": {
                            "pop": 1.0,
                            "rock": 0.20000000298023224,
                            "jazz": 0.0
                        }
                    }
                }
            }
        ]
    }
}
Query performance depends on the fields returned, see the performance section.
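A client usually reads the selected fields out of each hit. The following is a minimal Python sketch of that step, assuming the usual result layout where hits are listed in children under root and each hit carries a fields object. The helper name hit_fields and the shortened sample result are illustrative only, not part of Vespa.

```python
import json

# Sample result, shortened from the select * example in this section.
RESULT = """
{"root": {"children": [{
    "id": "id:mynamespace:music::a-head-full-of-dreams",
    "relevance": 0.16343879032006284,
    "source": "mycontentcluster",
    "fields": {
        "sddocname": "music",
        "documentid": "id:mynamespace:music::a-head-full-of-dreams",
        "artist": "Coldplay",
        "album": "A Head Full of Dreams",
        "year": 2015
    }}]}}
"""


def hit_fields(result_json, names):
    """Return one dict per hit, holding only the requested fields that are present."""
    root = json.loads(result_json).get("root", {})
    rows = []
    for hit in root.get("children", []):
        fields = hit.get("fields", {})
        # Fields that were not returned for this hit are silently skipped.
        rows.append({name: fields[name] for name in names if name in fields})
    return rows


rows = hit_fields(RESULT, ["artist", "album", "missing"])
print(rows)
```

Skipping absent fields rather than raising an error mirrors the fact that a hit only contains the fields that were selected and available in the summary class.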
Summary classes
Use summary classes (see the reference) to simplify the YQL
by naming the fields in a class:
schema music {
    document music {
        field artist type string {
            indexing: summary | index
        }
        field album type string {
            indexing: summary | index
            index: enable-bm25
        }
        field year type int {
            indexing: summary | attribute
        }
        field category_scores type tensor<float>(cat{}) {
            indexing: summary | attribute
        }
    }
    document-summary short-summary {
        summary artist type string {}
        summary album type string {}
    }
}
$ vespa query "select * from music where album contains 'head'" \
"presentation.summary=short-summary"
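Applications that do not use the CLI can send the same parameters over HTTP. The sketch below only builds the request URL. It assumes a query endpoint at http://localhost:8080 with the /search/ path, and build_query_url is a hypothetical helper name, so adjust both to the actual deployment.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(endpoint, yql, summary=None):
    """Build a GET URL carrying the YQL and, optionally, the summary class."""
    params = {"yql": yql}
    if summary is not None:
        # Same parameter as passed to the CLI: presentation.summary
        params["presentation.summary"] = summary
    return f"{endpoint.rstrip('/')}/search/?{urlencode(params)}"


# Assumed local endpoint; replace with the real container address.
url = build_query_url(
    "http://localhost:8080",
    "select * from music where album contains 'head'",
    summary="short-summary",
)
print(url)
print(parse_qs(urlsplit(url).query))
```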
The select statement in YQL lists a set of fields to return.
Vespa generally makes a best effort to return those fields, and only those fields,
unless a wildcard ("*") is given as argument.
The wildcard implies returning the full set of fields included in the given summary class.
Note:
A good practice is to add the summary class to a
query profile.
Application logic can then use the query profile in queries, having both query parameters and summary class in one.
In conjunction with YQL statements,
the summary class argument defines the set of fields
from which the YQL select then chooses a subset.
In other words, if the YQL expression is select * …,
and the summary class argument is short-summary,
all the fields in the summary class short-summary will be returned.
The default summary class contains all schema fields
plus sddocname and documentid.
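The interplay between the select list and the summary class can be pictured as a set operation. The Python sketch below is a simplified model of the rules described here, not Vespa's implementation. The field lists are taken from the music schema in this document, and returned_fields is a hypothetical helper.

```python
# Fields per summary class, following the music schema in this document.
SUMMARY_CLASSES = {
    "default": ["sddocname", "documentid", "artist", "album", "year", "category_scores"],
    "short-summary": ["artist", "album"],
}


def returned_fields(select, summary="default"):
    """Model which fields come back; select is "*" or a list of field names."""
    available = SUMMARY_CLASSES[summary]
    if select == "*":
        # The wildcard returns every field in the given summary class.
        return list(available)
    # Otherwise the select list picks a subset of the summary class.
    return [field for field in select if field in available]


print(returned_fields("*", "short-summary"))
print(returned_fields(["artist", "year"], "short-summary"))
print(returned_fields(["artist", "year"]))
```

In this model, year is dropped from the second call because short-summary does not contain it, while the third call returns both fields from the default class.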
Summary field rename
Use a summary class to give a field another name in query results:
document-summary rename-summary {
    summary artist_name type string {
        source: artist
    }
}
Use dynamic
to generate query-dependent abstracts (snippets) of fields, based on the query keywords.
Example from Vespa Documentation Search (see the
schema):
document doc {
    field content type string {
        indexing: summary | index
        summary: dynamic
    }
}
A query for "document summary" returns:
Use document summaries to configure which fields ... indexing: summary | index } } document-summary titleyear { summary title type string ...
The example above creates a dynamic summary with the matched terms highlighted.
Highlighting the matched terms is called bolding
and can be enabled independently of dynamic summaries.
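To make the idea of bolding concrete, here is a toy Python illustration that wraps query terms in marker tags. It is not how Vespa computes snippets: the matching is done by Vespa itself and may differ, for example in how word forms are matched. The hi tags and the function name bold_terms are assumptions chosen for this sketch.

```python
import re


def bold_terms(text, query_terms, open_tag="<hi>", close_tag="</hi>"):
    """Wrap whole-word, case-insensitive matches of the query terms in tags."""
    if not query_terms:
        return text
    alternatives = "|".join(re.escape(term) for term in query_terms)
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    # Keep the matched text as written and only add the surrounding tags.
    return pattern.sub(lambda m: f"{open_tag}{m.group(0)}{close_tag}", text)


print(bold_terms("Use document summaries to configure which fields", ["document", "summaries"]))
```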
Performance
Attribute fields are held in memory.
Summaries are therefore memory-only operations if all requested fields are attributes,
which is the optimal way to get high query throughput.
The other document fields are stored as blobs/records in the
document store.
Requesting these fields will therefore require a disk access, increasing latency.
Important:
The default summary class will access the document store
as it includes the documentid field
which is stored there.
For maximum query throughput using memory-only access, use a dedicated summary class with attributes only.
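Whether a summary class qualifies as memory-only can be checked against the schema. The Python sketch below models that rule for the music schema used in this document: a class is memory-only when every field in it is an attribute. The SCHEMA mapping and the is_memory_only helper are illustrative, not a Vespa API.

```python
# Field -> indexing settings, mirroring the music schema in this document.
SCHEMA = {
    "artist": {"summary", "index"},
    "album": {"summary", "index"},
    "year": {"summary", "attribute"},
    "category_scores": {"summary", "attribute"},
}


def is_memory_only(summary_fields, schema=SCHEMA):
    """True if every listed field is an attribute, so no document store read is needed."""
    # Fields missing from the schema mapping (such as documentid) count as
    # non-attribute, since they are read from the document store.
    return all("attribute" in schema.get(field, set()) for field in summary_fields)


print(is_memory_only(["year", "category_scores"]))
print(is_memory_only(["year", "artist"]))
print(is_memory_only(["documentid", "year"]))
```

With this schema, a dedicated class holding only year and category_scores would avoid the document store, while adding artist or documentid would not.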
When using additional summary classes to increase performance,
only the network data size is changed - the data read from storage is unchanged.
Having "debug" fields with summary enabled will hence also affect the
amount of information that needs to be read from disk.
See query execution for a
breakdown of the summary (a.k.a. result processing, rendering) phase:
- Getting data across from content nodes to containers.
- Deserialization from internal binary formats (potentially) to Java objects if touched in a Searcher, and finally serialization to JSON (default rendering), plus rendering and network.
The work, and thus the latency, increases with more hits.
Use query tracing to analyze performance.