Grouping Reference
This is the Vespa grouping reference. Read the Vespa grouping guide first for an introduction to grouping and examples.
Refer to groupingSessionCache for grouping performance. Also note that using a multivalued attribute (such as an array of doubles) in a grouping expression affects performance. Such operations can hit a memory bandwidth bottleneck, particularly if the set of hits to be processed is large, as more data is evaluated.
group
Group query results using a custom expression (using the group clause):
 A numerical constant
 A document attribute
 A function over another expression (xorbit, md5, cat, xor, and, or, add, sub, mul, div, mod) or any other expression
 The data type of an expression is resolved on a best-effort basis, similar to how common programming languages resolve arithmetic on operands of different data types
 The result of any expression is either a scalar or a single-dimension array
  add(<array>) adds all elements together to produce a scalar
  add(<arrayA>, <arrayB>) adds the arrays element by element, producing a new array whose size is max(<arrayA>, <arrayB>)
Groups can contain subgroups (by using each and group operations), and may be nested to any level.
Multiple subgroupings or outputs can be created under the same group level, using multiple parallel each or all clauses, and each one may be labelled using as(mylabel).
When grouping results, groups that contain outputs, group lists and hit lists are generated. Group lists contain subgroups, and hit lists contain hits that are part of the owning group.
The identity of a group is held by its id. Scalar identities such as long, double and string, are directly available from the id, whereas range identities used for bucket aggregation are separated into the subnodes from and to. Refer to the result format reference.
Multivalue attributes
A multivalue attribute is a weighted set, array or map. Most grouping functions will just handle the elements of multivalued attributes separately, as if they were all individual values in separate documents. The following syntax can be used when grouping on map attribute fields.
Group on map keys:
all( group(mymap.key) each(output(count())) )
Group on values for key my_key:
all( group(my_map{"my_key"}) each(output(count())) )
Group on struct field my_field referenced in map element my_key:
all( group(my_map{"my_key"}.my_field) each(output(count())) )
The key can either be specified directly (above) or indirectly via a key source attribute. The key is retrieved from the key source attribute for each document. Note that the key source attribute must be single value and have the same data type as the key type of the map. Using a key source attribute is not supported for streaming search:
all( group(my_map{attribute(my_key_source)}) each(output(count())) )
Tensors can not be used in grouping.
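As a rough illustration of how the elements of a multivalued attribute are handled separately when grouping, the following Python sketch models a count per group value. It is not Vespa code: the documents, the tags field and the helper group_count are hypothetical and only mimic the semantics of an expression like all( group(tags) each(output(count())) ).

```python
from collections import Counter


def group_count(docs, field):
    """Count hits per group value, treating each element of a multivalue
    field as if it were a single value in its own document."""
    counts = Counter()
    for doc in docs:
        value = doc[field]
        # A multivalue field contributes one group membership per element;
        # a single-value field contributes exactly one.
        elements = value if isinstance(value, (list, tuple)) else [value]
        for element in elements:
            counts[element] += 1
    return dict(counts)


# Hypothetical documents with an array attribute named "tags".
docs = [
    {"tags": ["a", "b"]},
    {"tags": ["b"]},
    {"tags": ["b", "c"]},
]
result = group_count(docs, "tags")
print(result)
```

The second document is counted once, while the first and third are each counted in two groups, which is why group counts over a multivalued attribute can add up to more than the number of hits.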
order
Each level of grouping may specify how to order its groups (using order):
 Ordering can be done using any of the available aggregates
 Multilevel grouping allows strict ordering where primary aggregates may be equal
 Ordering is either ascending or descending, specified per level of ordering
 Groups are sorted using locale aware sorting
Limit the number of groups returned for each level using max, returning only the first n groups as specified by order:
 order changes the ordering of groups after a merge operation for the following aggregators: count, avg and sum
 order will not change the ordering of groups after a merge operation when max or min is used
 Default order, max(relevancy()), does not require use of precision
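The interplay between order and max can be sketched in Python as "sort the groups, then keep the first n". This is a simplified single-node model of an expression such as all( group(category) max(2) order(-count()) each(output(count())) ). The category field, the sample counts and the tie-break on group id are assumptions made for illustration, and merging across nodes is not modelled.

```python
def order_and_limit(group_counts, max_groups):
    """Order groups by descending count, then keep the first max_groups.

    Ties are broken by group id here; that tie-break is an assumption of
    this sketch, not documented Vespa behaviour.
    """
    ordered = sorted(group_counts.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:max_groups]


# Hypothetical per-group hit counts.
counts = {"books": 12, "music": 30, "games": 7, "films": 30}
top = order_and_limit(counts, 2)
print(top)
```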
continuations
Pagination of grouping results is managed by continuations. These are opaque objects that can be combined and resubmitted using the continuations annotation on the grouping step of the query to move to the previous or next page in a result list.
All root groups contain a single this continuation per select. That continuation represents the current view, and if submitted as the sole continuation, it will reproduce the exact same result as the one that contained it.
There are zero or one prev/next continuation per group and hit list. Submit any number of these to retrieve the next/previous pages of the corresponding lists.
Any number of continuations can be combined in a query, but the first must always be the this continuation. E.g. one may simultaneously move both to the next page of one list, and the previous page of another.
Note: If more than one continuation object is provided for the same group or hit list, the one given last is the one that takes effect. This is because continuations are processed in the order given, and they replace whatever continuations they collide with.
If working programmatically with grouping, find the Continuation objects within RootGroup, GroupList and HitList result objects. These can then be added back into the continuation list of the GroupingRequest to paginate.
Refer to the grouping guide for an example.
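The "last one wins" rule for colliding continuations can be pictured with a toy Python model. Real continuations are opaque strings, so the (target, offset) pairs and the function resolve_continuations below are purely hypothetical stand-ins used to show the processing order, not the actual encoding.

```python
def resolve_continuations(continuations):
    """Return the effective paging offset per target list.

    `continuations` is a list of (target, offset) pairs in the order they
    were submitted. They are processed in that order, so a later entry
    replaces an earlier one that addresses the same list.
    """
    effective = {}
    for target, offset in continuations:
        effective[target] = offset
    return effective


# Hypothetical request: two continuations collide on the same group list.
submitted = [
    ("group_list:category", 10),
    ("hit_list:books", 5),
    ("group_list:category", 20),
]
effective = resolve_continuations(submitted)
print(effective)
```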
Labels
Lists created using the each keyword can be assigned a label using the construct each(...) as(mylabel). The outputs created by that each clause will be identified by this label.
where(true)
Only results matching the query are grouped, also in streaming search, with one exception: using where(true) in the grouping expression groups all documents matching the selection string in the search API. Example:
all( where(true) all(group(myfield) each(output(count()))) )
When using where(true), relevancy is not calculated for groups, as only matched hits have relevance.
Refer to streaming search grouping for an example.
Aggregators
Each level of grouping specifies a set of aggregates to collect for all documents that belong to that group (using the output operation):
 The documents in a group, retrieved using a specified summary class
 The count of documents in a group
 The sum, average, min, max, xor or standard deviation of an expression
When using order, aggregators can also be used in expressions in order to get increased control over group sorting. This does not work with expressions that take attributes as an argument, unless the expression is enclosed within an aggregator.
Using sum, max on a multivalued attribute:
Doing an operation such as output(sum(myarray)) will run the sum over each element value in each document. The result is the sum of sums of values. Similarly, max(myarray) will yield the maximal element over all elements in all documents, and so on.
Multivalue fields such as maps and arrays can be used for grouping. However, using aggregation functions such as sum() on such fields can give misleading results. Assume a map from strings to integers, where the strings are some sort of key to use for grouping. The following expression will provide the sum of the values for all keys, and not the sum of the values within each key, as one would expect:
all( group(mymap.key) each(output(sum(mymap.value))) )
It is still possible, however, to run the following expression to get the sum of values within a specific key:
all( group(mymap{"foo"}) each(output(sum(mymap.value))) )
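The pitfall with sum() over a map can be made concrete with a small Python model. It assumes, as described above, that each document is a member of the group for every key it contains and that sum(mymap.value) runs over every value in that document. The sample documents and the function name are hypothetical.

```python
from collections import defaultdict


def sum_values_grouped_by_key(docs):
    """Model of all( group(mymap.key) each(output(sum(mymap.value))) )."""
    totals = defaultdict(int)
    for doc in docs:
        mymap = doc["mymap"]
        # The aggregator sums every value in the document, not only the
        # value stored under the key that defines the current group.
        doc_sum = sum(mymap.values())
        for key in mymap:
            totals[key] += doc_sum
    return dict(totals)


# Hypothetical documents with a map<string, int> field.
docs = [
    {"mymap": {"foo": 1, "bar": 2}},
    {"mymap": {"foo": 10}},
]
totals = sum_values_grouped_by_key(docs)
print(totals)
```

A per-key sum would have been 11 for foo and 2 for bar. The model instead reports 13 and 3, because the first document contributes both of its values to both groups.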
Group list aggregators  
Name  Description  Arguments  Result 

count  Counts the number of unique groups (as produced by group). Note that count operates independently of max and that this count is an estimate using HyperLogLog++, which is an algorithm for the count-distinct problem  None  Long 
Group aggregators  
Name  Description  Arguments  Result 
count  Increments a long counter every time it is invoked  None  Long 
sum  Sums the argument over all selected documents  Numeric  Numeric 
avg  Computes the average over all selected documents  Numeric  Numeric 
min  Keeps the minimum value of selected documents  Numeric  Numeric 
max  Keeps the maximum value of selected documents  Numeric  Numeric 
xor  XOR the values (their least significant 64 bits) of all selected documents  Any  Long 
stddev  Computes the population standard deviation over all selected documents  Numeric  Double 
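To make the group aggregators above concrete, here is a Python sketch that computes each of them over the values selected for one group. The sample values are hypothetical. Integer inputs are assumed, and the 64-bit mask in xor is this sketch's reading of "least significant 64 bits".

```python
import statistics
from functools import reduce

MASK_64 = 0xFFFFFFFFFFFFFFFF  # keep only the least significant 64 bits


def aggregate(values):
    """Compute the group aggregators over the values of one group."""
    return {
        "count": len(values),
        "sum": sum(values),
        "avg": sum(values) / len(values),
        "min": min(values),
        "max": max(values),
        "xor": reduce(lambda acc, v: acc ^ (v & MASK_64), values, 0),
        # Population standard deviation (divides by N, not N - 1).
        "stddev": statistics.pstdev(values),
    }


# Hypothetical attribute values for the documents in one group.
prices = [2, 4, 4, 4, 5, 5, 7, 9]
stats = aggregate(prices)
print(stats)
```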
Hit aggregators  
Name  Description  Arguments  Result 
summary  Produces a summary of the requested summary class  Name of summary class  Summary 
Expressions
Arithmetic expressions  
Name  Description  Arguments  Result 

add  Add the arguments together  Numeric+  Numeric 
+  Add left and right argument  Numeric, Numeric  Numeric 
mul  Multiply the arguments together  Numeric+  Numeric 
*  Multiply left and right argument  Numeric, Numeric  Numeric 
sub  Subtract second argument from first, third from result, etc  Numeric+  Numeric 
-  Subtract right argument from left  Numeric, Numeric  Numeric 
div  Divide first argument by second, result by third, etc  Numeric+  Numeric 
/  Divide left argument by right  Numeric, Numeric  Numeric 
mod  Modulo first argument by second, result by third, etc  Numeric+  Numeric 
%  Modulo left argument by right  Numeric, Numeric  Numeric 
neg  Negate argument  Numeric  Numeric 
-  Negate right argument  Numeric  Numeric 
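The multi-argument forms (add, mul, sub, div, mod) apply their operator from left to right, feeding each result into the next step. A minimal Python sketch of that folding is shown below. Only non-negative integers are used, since type resolution and the handling of negative operands in Vespa are not modelled here.

```python
import operator
from functools import reduce


def fold(op, args):
    """Apply a binary operator left to right: ((a op b) op c) op ..."""
    return reduce(op, args)


add_result = fold(operator.add, [1, 2, 3])      # (1 + 2) + 3
mul_result = fold(operator.mul, [2, 3, 4])      # (2 * 3) * 4
sub_result = fold(operator.sub, [100, 30, 5])   # (100 - 30) - 5
mod_result = fold(operator.mod, [100, 30, 4])   # (100 % 30) % 4
print(add_result, mul_result, sub_result, mod_result)
```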
Bitwise expressions  
Name  Description  Arguments  Result 
and  AND the arguments in order  Long+  Long 
or  OR the arguments in order  Long+  Long 
xor  XOR the arguments in order  Long+  Long 
String expressions  
Name  Description  Arguments  Result 
strlen  Count the number of bytes in argument  String  Long 
strcat  Concatenate arguments in order  String+  String 
Type conversion expressions  
Name  Description  Arguments  Result 
todouble  Convert argument to double  Any  Double 
tolong  Convert argument to long  Any  Long 
tostring  Convert argument to string  Any  String 
toraw  Convert argument to raw  Any  Raw 
Raw data expressions  
Name  Description  Arguments  Result 
cat  Cat the binary representation of the arguments together  Any+  Raw 
md5  Does an md5 over the binary representation of the argument, and keeps the lowest 'width' bits  Any, Numeric(width)  Raw 
xorbit  Does an xor of 'width' bits over the binary representation of the argument. Width is rounded up to a multiple of 8  Any, Numeric(width)  Raw 
Accessor expressions  
Name  Description  Arguments  Result 
relevance  Return the computed rank of a document  None  Double 
docidnsspecific  Return the docid without namespace. Applies only to streaming search, e.g. all( group(docidnsspecific()) each(output(count())) )  None  String 
<attributename>  Return the value of the named attribute  None  Any 
array.at 
Array element access.
The expression array.at(myarray, idx) returns one value per document
by evaluating the idx expression and using it as an index into the array.
The expression can then be used to build bigger expressions such as
output(sum(array.at(myarray, 0)))
which will sum the first element in the array of each document.
 Array, Numeric  Any 
interpolatedlookup 
Counts elements in a sorted array that are less than an expression,
with linear interpolation if the expression is between element values.
The operation
When the lookup argument's value is between two consecutive array element values,
the returned position will be a linear interpolation between their respective indexes.
The return value is always in the range
Assume
 Array, Numeric  Numeric 
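The two array accessors can be sketched in Python as follows. This is an illustrative model, not Vespa's implementation: the function names are hypothetical, out-of-range indexes for array.at are not modelled, and clamping the lookup result to the first and last index is an assumption of this sketch.

```python
from bisect import bisect_right


def array_at(values, idx):
    """Model of array.at(myarray, idx): one value per document."""
    return values[int(idx)]


def interpolated_lookup(sorted_values, lookup):
    """Model of interpolatedlookup(myarray, expr) on an ascending array.

    Returns a fractional position: the index of the last element not
    greater than `lookup`, plus a linear interpolation towards the next.
    """
    n = len(sorted_values)
    if n == 0 or lookup <= sorted_values[0]:
        return 0.0
    if lookup >= sorted_values[-1]:
        return float(n - 1)
    hi = bisect_right(sorted_values, lookup)  # first element > lookup
    lo = hi - 1
    span = sorted_values[hi] - sorted_values[lo]
    return lo + (lookup - sorted_values[lo]) / span


first = array_at([10, 20, 30], 0)
position = interpolated_lookup([0, 1, 2, 4, 8], 5)
print(first, position)
```

In the example, 5 lies a quarter of the way between the elements 4 (index 3) and 8 (index 4), so the sketch returns 3.25.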
Bucket expressions  
Name  Description  Arguments  Result 
fixedwidth  Maps the value of the first argument into consecutive buckets whose width equals the second argument  Any, Numeric  NumericBucketList 
predefined  Maps the value of the first argument into the given buckets. Standard mathematical start and end specifiers may be used to define the width of a bucket. The "(" and ")" evaluate to "[" and ">" by default  Any, Bucket+  BucketList 
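The two bucket expressions can be modelled in Python as below. The sketch assumes that fixed-width buckets are aligned to multiples of the width, and that buckets use the default bounds described above (inclusive start, exclusive end). Returning None for a value outside all predefined buckets is a choice of this sketch, not documented behaviour.

```python
import math


def fixedwidth(value, width):
    """Return the (from, to) bucket of the given width containing value."""
    start = math.floor(value / width) * width
    return (start, start + width)


def predefined(value, buckets):
    """Return the first (from, to) bucket with from <= value < to."""
    for low, high in buckets:
        if low <= value < high:
            return (low, high)
    return None


fixed_bucket = fixedwidth(17, 5)
chosen_bucket = predefined(10, [(0, 10), (10, 20), (20, float("inf"))])
print(fixed_bucket, chosen_bucket)
```

The value 10 falls into the second bucket rather than the first because the end of a bucket is exclusive by default.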
Time expressions  
Use the query parameter "timezone" to set the timezone to use when running these expressions, e.g. &timezone=GMT-1. Refer to Sun's TimeZone reference.
Name  Description  Arguments  Result 
time.dayofmonth  Returns the day of month (1-31) for the given timestamp  Long  Long 
time.dayofweek  Returns the day of week (0-6) for the given timestamp, Monday being 0  Long  Long 
time.dayofyear  Returns the day of year (0-365) for the given timestamp  Long  Long 
time.hourofday  Returns the hour of day (0-23) for the given timestamp  Long  Long 
time.minuteofhour  Returns the minute of hour (0-59) for the given timestamp  Long  Long 
time.monthofyear  Returns the month of year (1-12) for the given timestamp  Long  Long 
time.secondofminute  Returns the second of minute (0-59) for the given timestamp  Long  Long 
time.year  Returns the full year (e.g. 2009) of the given timestamp  Long  Long 
time.date  Returns the date (e.g. 20090110) of the given timestamp  Long  Long 
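The value ranges in the table above can be checked against a small Python model. The sketch assumes that the timestamp is in seconds since the Unix epoch and that no timezone parameter is set, so UTC is used. The function name time_fields is hypothetical.

```python
from datetime import datetime, timezone


def time_fields(timestamp):
    """Decompose a timestamp (seconds since epoch, UTC) into the fields
    returned by the time.* expressions."""
    t = datetime.fromtimestamp(timestamp, tz=timezone.utc)
    return {
        "time.year": t.year,
        "time.monthofyear": t.month,                 # 1-12
        "time.dayofmonth": t.day,                    # 1-31
        "time.dayofyear": t.timetuple().tm_yday - 1, # 0-365
        "time.dayofweek": t.weekday(),               # 0-6, Monday is 0
        "time.hourofday": t.hour,                    # 0-23
        "time.minuteofhour": t.minute,               # 0-59
        "time.secondofminute": t.second,             # 0-59
        "time.date": t.year * 10000 + t.month * 100 + t.day,
    }


# 1231545600 is 2009-01-10 00:00:00 UTC, a Saturday.
fields = time_fields(1231545600)
print(fields)
```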
List expressions  
Name  Description  Arguments  Result 
size  Return the number of elements in the argument if it is a list. If not return 1  Any  Long 
sort  Sort the elements in the argument in ascending order if the argument is a list. If not, it is a NOP  Any  Any 
reverse  Reverse the elements in the argument if the argument is a list. If not, it is a NOP  Any  Any 
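The list expressions treat non-list arguments as a special case, which the following Python sketch mirrors. The function names carry a list_ prefix only to avoid clashing with Python built-ins, and a Python list stands in for a multivalue result.

```python
def list_size(value):
    """Number of elements if value is a list, otherwise 1."""
    return len(value) if isinstance(value, list) else 1


def list_sort(value):
    """Ascending sort if value is a list, otherwise a no-op."""
    return sorted(value) if isinstance(value, list) else value


def list_reverse(value):
    """Reversed copy if value is a list, otherwise a no-op."""
    return value[::-1] if isinstance(value, list) else value


values = [3, 1, 2]
print(list_size(values), list_sort(values), list_reverse(values))
print(list_size(42), list_sort(42), list_reverse(42))
```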
Other expressions  
Name  Description  Arguments  Result 
zcurve.x 
Returns the X component of the given zcurve encoded 2d point.
All fields of type "position" have an accompanying "<fieldName>_zcurve" attribute
that can be decoded using this expression, e.g. zcurve.x(foo_zcurve)
 Long  Long 
zcurve.y  Returns the Y component of the given zcurve encoded 2d point  Long  Long 
uca 
Converts the attribute string using unicode collation algorithm.
Groups are sorted using locale aware sorting, with the default and primary strength values, respectively:
all( group(s) order(max(uca(s, "sv"))) each(output(count())) )
all( group(s) order(max(uca(s, "sv", "PRIMARY"))) each(output(count())) )
Any, Locale(String), Strength(String)  Raw 
Single argument standard mathematical expressions  
These are the standard mathematical functions as found in the Java Math class.
Name  Description  Arguments  Result 
math.exp  Double  Double  
math.log  Double  Double  
math.log1p  Double  Double  
math.log10  Double  Double  
math.sqrt  Double  Double  
math.cbrt  Double  Double  
math.sin  Double  Double  
math.cos  Double  Double  
math.tan  Double  Double  
math.asin  Double  Double  
math.acos  Double  Double  
math.atan  Double  Double  
math.sinh  Double  Double  
math.cosh  Double  Double  
math.tanh  Double  Double  
math.asinh  Double  Double  
math.acosh  Double  Double  
math.atanh  Double  Double  
Dual argument standard mathematical expressions  
Name  Description  Arguments  Result 
math.pow  Return X^Y.  Double, Double  Double 
math.hypot  Return length of hypotenuse given X and Y sqrt(X^2 + Y^2)  Double, Double  Double 
Select parameter language grammar
request ::= group [ "where" "(" ( "true" | "$query" ) ")" ]
group ::= ( "all" | "each" ) "(" operations ")" [ "as" "(" identifier ")" ]
operations ::= [ "group" "(" expression ")" ]
    ( ( "alias" "(" identifier "," expression ")" ) |
      ( "max" "(" number ")" ) |
      ( "order" "(" expList | aggrList ")" ) |
      ( "output" "(" aggrList ")" ) |
      ( "precision" "(" number ")" ) )*
    group*
aggrList ::= aggr ( "," aggr )*
aggr ::= ( ( "count" "(" ")" ) |
      ( "sum" "(" exp ")" ) |
      ( "avg" "(" exp ")" ) |
      ( "max" "(" exp ")" ) |
      ( "min" "(" exp ")" ) |
      ( "xor" "(" exp ")" ) |
      ( "stddev" "(" exp ")" ) |
      ( "summary" "(" [ identifier ] ")" ) )
    [ "as" "(" identifier ")" ]
expList ::= exp ( "," exp )*
exp ::= ( "+" | "-" ) ( "$" identifier [ "=" math ] ) | ( math ) | ( aggr )
math ::= value [ ( "+" | "-" | "*" | "/" | "%" ) value ]
value ::= ( "(" exp ")" ) |
    ( "add" "(" expList ")" ) |
    ( "and" "(" expList ")" ) |
    ( "cat" "(" expList ")" ) |
    ( "div" "(" expList ")" ) |
    ( "docidnsspecific" "(" ")" ) |
    ( "fixedwidth" "(" exp "," number ")" ) |
    ( "interpolatedlookup" "(" attributeName "," exp ")" ) |
    ( "math" "." ( ( "exp" | "log" | "log1p" | "log10" | "sqrt" | "cbrt" |
                     "sin" | "cos" | "tan" | "asin" | "acos" | "atan" |
                     "sinh" | "cosh" | "tanh" | "asinh" | "acosh" | "atanh" ) "(" exp ")" |
                   ( "pow" | "hypot" ) "(" exp "," exp ")" ) ) |
    ( "max" "(" expList ")" ) |
    ( "md5" "(" exp "," number "," number ")" ) |
    ( "min" "(" expList ")" ) |
    ( "mod" "(" expList ")" ) |
    ( "mul" "(" expList ")" ) |
    ( "or" "(" expList ")" ) |
    ( "predefined" "(" exp "," "(" bucket ( "," bucket )* ")" ")" ) |
    ( "reverse" "(" exp ")" ) |
    ( "relevance" "(" ")" ) |
    ( "sort" "(" exp ")" ) |
    ( "strcat" "(" expList ")" ) |
    ( "strlen" "(" exp ")" ) |
    ( "size" "(" exp ")" ) |
    ( "sub" "(" expList ")" ) |
    ( "time" "." ( "date" | "year" | "monthofyear" | "dayofmonth" | "dayofyear" |
                   "dayofweek" | "hourofday" | "minuteofhour" | "secondofminute" ) "(" exp ")" ) |
    ( "todouble" "(" exp ")" ) |
    ( "tolong" "(" exp ")" ) |
    ( "tostring" "(" exp ")" ) |
    ( "toraw" "(" exp ")" ) |
    ( "uca" "(" exp "," string [ "," string ] ")" ) |
    ( "xor" "(" expList ")" ) |
    ( "xorbit" "(" exp "," number ")" ) |
    ( "zcurve" "." ( "x" | "y" ) "(" exp ")" ) |
    ( attributeName "." "at" "(" number ")" ) |
    ( attributeName )
bucket ::= "bucket" ( "(" | "[" | "<" )
    ( "-inf" | rawvalue | number | string )
    [ "," ( "inf" | rawvalue | number | string ) ]
    ( ")" | "]" | ">" )
rawvalue ::= "{" ( ( string | number ) "," )* "}"