Document JSON Format

This document describes the JSON format used for sending document operations to Vespa. Field types are defined in the search definition reference. This is a reference for:

  • JSON representation of field types in Vespa documents
  • JSON representation of document operations (put, get, remove, update)
  • JSON representation of addressing fields for update, and update operations
Also refer to encoding troubleshooting.

Field types

string

"name": "Polly"

Feeding in an empty string ("") for a field will have the same effect as not feeding a value for that field, and the field will not be rendered in the document API and in document summaries.

int

"age": 42

long

"age": 42

bool

true or false:

"alive": false

byte

"tinynumber": 127

float

"weight": 123.4567

double

"weight": 123.4567

position

A position is encoded as a latitude;longitude string. Valid formats:

  1. S22.4532;W123.9887 - default query/result format
  2. N72°23'52;E26°04'22
  3. N72o20.92;E26o08.54
Latitude is prefixed by N or S, and longitude by E or W. The angular measurement is expressed as degrees with a decimal fraction, or as degrees subdivided in minutes and seconds. It is also valid to express minutes with a decimal fraction, supporting regular GPS output format. Small letter o may be used as a replacement for degrees.

Document API

To input a location field using the document api, use the latitude;longitude string:

"location": "N37.401;W121.996"
When output in document api, the field is rendered as:
"location": {
    "y": 37401000,
    "x": -121996000
}
The x and y coordinates are in millionths of degrees, where x is the longitude and y is the latitude.
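To illustrate how the input string relates to the rendered coordinates, here is a minimal sketch in Python. The function name position_to_xy is hypothetical, and the sketch handles only the decimal-degree format, not the degrees/minutes/seconds variants.

```python
def position_to_xy(position: str) -> dict:
    """Convert e.g. "N37.401;W121.996" to x/y in millionths of degrees.

    Hypothetical helper; handles only the decimal-degree format.
    """
    lat_text, lon_text = position.split(";")
    # N and E are positive, S and W are negative.
    signs = {"N": 1, "S": -1, "E": 1, "W": -1}
    latitude = signs[lat_text[0]] * float(lat_text[1:])
    longitude = signs[lon_text[0]] * float(lon_text[1:])
    # y carries the latitude and x the longitude, both scaled by one million.
    return {"y": round(latitude * 1_000_000), "x": round(longitude * 1_000_000)}


location = position_to_xy("N37.401;W121.996")
```

Rounding to the nearest integer avoids off-by-one results caused by floating point representation of the decimal degrees.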

Document summary

A position field configured as:

field location type position { indexing: attribute }
is rendered as:
"location.position": {
    "x": -121996000,
    "y": 37401000,
    "latlong": "N37.401000;W121.996000"
}
Adding summary:
field location type position { indexing: summary | attribute }
will render it as:
"location": {
    "x": -121996000,
    "y": 37401000
},
"location.position": {
    "x": -121996000,
    "y": 37401000,
    "latlong": "N37.401000;W121.996000"
}

If the request specifies a position, the distance to this position is calculated and rendered in fieldname.distance. Find details in Geo search:

"location.position": {
    "x": -121996000,
    "y": 37401000,
    "latlong": "N37.401000;W121.996000"
},
"location.distance": 27488

predicate

A predicate is a string:

"predicate_field": "gender in [Female] and age in [20..30] and pos in [1..4]"

raw

The content of a raw field is represented as a base64-encoded string:

"raw_field": "VW5rbm93biBhcnRpc3QgZnJvbSB0aGUgbW9vbg=="

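As an illustration, the value above can be produced with Python's standard base64 module. This is a sketch only; the helper name encode_raw is hypothetical.

```python
import base64


def encode_raw(data: bytes) -> str:
    """Return the base64 text to use as the JSON value of a raw field."""
    return base64.b64encode(data).decode("ascii")


# The bytes below are the decoded content of the example value above.
fields = {"raw_field": encode_raw(b"Unknown artist from the moon")}
```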
uri

A URI is a string:

"url": "https://www.yahoo.com/"

array

Arrays are represented as JSON arrays.

"int_array_field": [
    123,
    456,
    789
]

"string_array_field": [
    "item 1",
    "item 2",
    "item 3"
]

Feeding in an empty array ([]) for a field will have the same effect as not feeding a value for that field, and the field will not be rendered in the document API and in document summaries.

weightedset

Weighted sets are represented as maps where the value is the weight. Note, even if the key is not a string as such, it will be represented as a string in the JSON format.

"int_weighted_set": {
    "123": 2,
    "456": 78
}

"string_weighted_set": {
    "item 1": 143,
    "item 2": 6
}

Feeding in an empty weightedset ({}) for a field will have the same effect as not feeding a value for that field, and the field will not be rendered in the document API and in document summaries.
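The string-key requirement follows from JSON itself, where object keys are always strings. As a small sketch, Python's json module performs this conversion automatically when serializing a dict with integer keys (the variable names are illustrative):

```python
import json

# A weighted set with integer keys, as it might be held in client code.
int_weighted_set = {123: 2, 456: 78}

# json.dumps writes the integer keys as JSON strings.
encoded = json.dumps({"int_weighted_set": int_weighted_set})
```

Note that decoding the JSON again yields string keys, so client code has to convert them back if integer keys are needed.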

tensor

Tensor fields may be represented as an array of "cells":

"tensorfield": {
    "cells": [
        { "address": { "x": "a", "y": "0" }, "value": 2.0 },
        { "address": { "x": "a", "y": "1" }, "value": 3.0 },
        { "address": { "x": "b", "y": "0" }, "value": 4.0 },
        { "address": { "x": "b", "y": "1" }, "value": 5.0 }
    ]
}

This works for any tensor but is often unnecessarily verbose, which impacts performance when transferring large tensors. Therefore, a number of short forms are available. Use the shortest form applicable to your tensor type for the best possible performance.

Indexed tensors short form: May use a "values" array where the values are given in the standard value order. Dimensions are ordered alphabetically, and indexes of dimensions to the right are incremented before indexes to the left. For example, in a tensor with dimensions x,y, the "y" values for each value of "x" are adjacent:
"tensorfield": {
    "values": [ 2.0, 3.0, 5.0, 7.0 ]
}
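The value order can be sketched in Python as follows. The helper cells_to_values is hypothetical and assumes a tensor with indexed dimensions only; writing cells that are missing from the input as 0.0 is an assumption of this sketch.

```python
from itertools import product


def cells_to_values(cells, sizes):
    """Flatten the "cells" of an indexed tensor into the "values" short form.

    sizes maps each dimension name to its size, e.g. {"x": 2, "y": 2}.
    Cells missing from the input are written as 0.0 (assumption of this sketch).
    """
    dimensions = sorted(sizes)  # dimensions are ordered alphabetically
    by_index = {
        tuple(int(cell["address"][d]) for d in dimensions): cell["value"]
        for cell in cells
    }
    # product() advances the rightmost dimension fastest, matching the value order.
    all_indexes = product(*(range(sizes[d]) for d in dimensions))
    return [by_index.get(index, 0.0) for index in all_indexes]


# Cells given in arbitrary order for a tensor with dimensions x[2], y[2].
cells = [
    {"address": {"x": "1", "y": "0"}, "value": 5.0},
    {"address": {"x": "0", "y": "1"}, "value": 3.0},
    {"address": {"x": "0", "y": "0"}, "value": 2.0},
    {"address": {"x": "1", "y": "1"}, "value": 7.0},
]
tensorfield = {"values": cells_to_values(cells, {"x": 2, "y": 2})}
```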
Mixed tensors short form: May use a "blocks" array where the mapped dimensions are given as an address, and the dense values for each sparse address as a values array. For example, to specify the same tensor as in the "cells" example above:
"tensorfield": {
   "blocks":[
       {"address":{"x":"a"},"values":[2.0,3.0]},
       {"address":{"x":"b"},"values":[4.0,5.0]}
    ]
}
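The relationship between the "cells" and "blocks" forms can be sketched in Python. The helper cells_to_blocks is hypothetical; it assumes one mapped and one indexed dimension, and that every dense value of each block is present in the input.

```python
def cells_to_blocks(cells, mapped_dim, indexed_dim):
    """Group the cells of a mixed tensor into the "blocks" short form."""
    dense_by_label = {}
    for cell in cells:
        label = cell["address"][mapped_dim]
        index = int(cell["address"][indexed_dim])
        dense_by_label.setdefault(label, {})[index] = cell["value"]
    # Emit one block per mapped label, with dense values in index order.
    return [
        {"address": {mapped_dim: label}, "values": [dense[i] for i in sorted(dense)]}
        for label, dense in dense_by_label.items()
    ]


cells = [
    {"address": {"x": "a", "y": "0"}, "value": 2.0},
    {"address": {"x": "a", "y": "1"}, "value": 3.0},
    {"address": {"x": "b", "y": "0"}, "value": 4.0},
    {"address": {"x": "b", "y": "1"}, "value": 5.0},
]
tensorfield = {"blocks": cells_to_blocks(cells, mapped_dim="x", indexed_dim="y")}
```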

Short form for tensors with a single sparse dimension: May use an object to give cells or blocks where the sparse label is the key to each value. E.g for "cells":

"tensorfield": {
    "cells": {
        "a": 2.0,
        "b": 3.0
    }
}
and similarly for "blocks" (where this specifies the same tensor as the "blocks" example above):
"tensorfield": {
   "blocks":{
       "a":[2.0,3.0],
       "b":[4.0,5.0]
    }
}
struct

"mystruct": {
    "intfield": 123,
    "stringfield": "foo"
}

map

The JSON dictionary key must be a string, even if the map key type in the search definition is not a string:

"int_to_string_map": {
    "123": "foo",
    "456": "bar",
    "789": "foobar"
}

Feeding in an empty map ({}) for a field will have the same effect as not feeding a value for that field, and the field will not be rendered in the document API and in document summaries.

annotationreference

Annotation references do not have a JSON representation.

reference

String with document ID referring to a parent document:

"artist_ref": "id:mynamespace:artists::artist-1"

Empty fields

In general, fields that have not received a value during feeding are ignored when rendering the document; they are considered empty fields. In addition, some field types have values that cause the field to be considered empty. For instance, the empty string ("") is considered empty, as is the empty array ([]). See the field type descriptions above for details on each type.
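As a client-side illustration of which fields to expect when a document is rendered, the sketch below filters out the empty values named in this document: the empty string, array, weighted set and map. The helper names are hypothetical, and this is not how Vespa implements the check.

```python
def is_empty(value) -> bool:
    """True for the values this document describes as empty."""
    return value == "" or value == [] or value == {}


def expected_rendered_fields(fields: dict) -> dict:
    """Drop fields whose value is considered empty."""
    return {name: value for name, value in fields.items() if not is_empty(value)}


fed = {"name": "", "tags": [], "scores": {}, "age": 0, "alive": False, "title": "x"}
rendered = expected_rendered_fields(fed)
```

Numeric zero and false are regular values and are kept by this sketch.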

Document operations

Manage documents using:

  • Put - insert / overwrite document with same ID
  • Get - get a document by ID
  • Remove - remove a document by ID
  • Update - update fields in a document by ID. Updates are partial: submit new values only for the fields to update. If the document does not exist, a new document is created if create if nonexistent is used - otherwise, an error is returned. There are four basic operations at field level:
    • Assign: Replace the content of the field, by value or by using arithmetic operations
    • Add: Add a new value to a field (array, weightedset, tensor etc). Note: Messages can be re-sent by Vespa's Message Bus. This can cause unexpected results for all operations except assign and remove! If greater consistency is needed, use a conditional update instead.
    • Remove: Remove a value from a field.
    • Modify: Modify cells in tensors.
There are two methods for document operations:
  • Vespa HTTP client: This is a Java API / command line tool to feed asynchronous document operations to Vespa. Input is JSON with one or more document operations (for high throughput, batch operations). The document ID is hence in the JSON feed.
  • Document API: This synchronous API accepts one operation per request - the document ID is encoded in the URL

PUT

Vespa HTTP client:
{
    "put": "id:music:music::123",
    "fields": {
        "title": "Best of Bob Dylan"
    }
}
Document API:
http POST /document/v1/music/music/docid/123
{
    "fields": {
        "title": "Best of Bob Dylan"
    }
}

GET

Vespa HTTP client:
# not supported - use vespa-get
Document API:
http GET /document/v1/music/music/docid/123

REMOVE

Vespa HTTP client:
{
    "remove": "id:music:music::HitMe"
}
Document API:
http DELETE /document/v1/music/music/docid/123


UPDATE

Vespa HTTP client:
{
    "update": "id:music:music::123",
    "fields": {
        "title": {
            "assign": "The best of Bob Dylan"
        }
    }
}
Document API:
http PUT /document/v1/music/music/docid/123
{
    "fields": {
        "title": {
            "assign": "The best of Bob Dylan"
        }
    }
}
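The examples above use the same document in two forms: the full document ID in the JSON feed, and a URL path in the Document API. The sketch below derives the path from the ID in Python. The helper document_v1_path is hypothetical and covers only IDs with an empty key/value part, as in id:music:music::123; percent-encoding the path segments is an assumption based on general URL rules.

```python
from urllib.parse import quote


def document_v1_path(document_id: str) -> str:
    """Map "id:<namespace>:<doctype>::<user part>" to a /document/v1 path."""
    # Split at most four times so that colons in the user part are preserved.
    scheme, namespace, doctype, key_values, user_part = document_id.split(":", 4)
    if scheme != "id":
        raise ValueError("document ID must start with 'id:'")
    if key_values:
        raise ValueError("key/value pairs in the ID are not handled by this sketch")
    segments = [quote(part, safe="") for part in (namespace, doctype, user_part)]
    return "/document/v1/{}/{}/docid/{}".format(*segments)


path = document_v1_path("id:music:music::123")
```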

Test and set

An optional condition can be added to operations to specify a test and set condition. The value of the condition is a document selection encoded as a string. The put/update/remove operation is only applied if the condition matches an already existing document with that id. Example: Increment the sales field only if it is already equal to 999:

Vespa HTTP client:
{
    "update": "id:music:music::BestOf",
    "condition": "music.sales==999",
    "fields": {
        "sales": {
            "increment": 1
        }
    }
}
Document API:
http PUT /document/v1/music/music/docid/BestOf?condition=music.sales=='999'
{
    "fields": {
        "sales": {
            "increment": 1
        }
    }
}

Note: Use documenttype.fieldname in the condition, not only fieldname.

Note: If the condition is not met, an error is returned. ToDo: There is a discussion whether to change to not return error, and instead return a condition-not-met in the response.
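When the condition is passed as a request parameter, characters such as = need to be percent-encoded in the URL. A minimal Python sketch that prepares the query string and request body for a conditional increment follows. The helper name is hypothetical, and sending the request is left out.

```python
import json
from urllib.parse import urlencode


def conditional_increment(field: str, amount: int, condition: str):
    """Return (query string, JSON body) for a conditional increment update."""
    # urlencode percent-encodes the condition, e.g. "==" becomes "%3D%3D".
    query = urlencode({"condition": condition})
    body = json.dumps({"fields": {field: {"increment": amount}}})
    return query, body


query, body = conditional_increment("sales", 1, "music.sales==999")
```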

Update

Vespa supports making changes to an existing document without submitting the full document. This is called partial update. A partial update consists of the id of the existing document to update, and the operation(s) to execute on the fields.

Documents can be auto-created on updates using create if nonexistent. All data structures (attribute, index and summary) are updatable. The following operations are supported:

All field types
  • assign (may also be used to clear fields)
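Numeric field types
  • increment, decrement, multiply and divide (see Arithmetic)
Composite types
  • add Add entries to arrays and weighted sets.
  • remove Remove entries from weighted sets and maps.
  • match Address an individual element of an array or weighted set.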
Tensor types
  • modify Modify individual cells in a tensor. Can either replace, add or multiply cell values.
  • add Add cells to mapped or mixed tensors.
  • remove Remove cells from mapped or mixed tensors.

assign

assign is used to replace the value of a field (or an element of a collection) with a new value. When assigning, one can generally use the same syntax and structure as when feeding that field's value in a put operation.

Single value field

field title type string {
    indexing: summary
}
{
    "update": "id:mynamespace:music::example",
    "fields": {
        "title": {
            "assign": "The best of Bob Dylan"
        }
    }
}

Tensor field

field tensorfield type tensor(x{},y{}) {
    indexing: attribute | summary
}
{
    "update": "id:mynamespace:tensordoctype::example",
    "fields": {
        "tensorfield": {
            "assign": {
                "cells": [
                    { "address": { "x": "a", "y": "b" }, "value": 2.0 },
                    { "address": { "x": "c", "y": "d" }, "value": 3.0 }
                ]
            }
        }
    }
}

Struct field

Replacing all fields in a struct

A full struct is replaced by assigning an object of struct key/value pairs.

struct person {
    field first_name type string {}
    field last_name type string {}
}
field contact type person {
    indexing: summary
}
{
    "update": "id:mynamespace:workers::example",
    "fields": {
        "contact": {
            "assign": {
                "first_name": "Bob",
                "last_name": "The Plumber"
            }
        }
    }
}

Individual struct fields

Individual struct fields are updated using field path syntax. Refer to the reference for restrictions using structs.

{
    "update": "id:mynamespace:workers::example",
    "fields": {
        "contact.first_name": { "assign": "Bob" },
        "contact.last_name":  { "assign": "The Plumber" }
    }
}

Map field

Individual map entries can be updated using field path syntax. The following declaration defines a map where the key is an Integer and the value is a person struct.

struct person {
    field first_name type string {}
    field last_name type string {}
}
field contact type map<int, person> {
    indexing: summary
}
Example updating part of an entry in the contact map:
  • contact is the name of the map field to be updated
  • {0} is the key that is going to be updated
  • first_name is the struct field to be updated inside the person struct
{
    "update": "id:mynamespace:workers::example",
    "fields": {
       "contact{0}.first_name": { "assign": "John" }
    }
}
Assigning an element to a key in a map will insert the key/value mapping if it does not already exist, or overwrite it with the new value if it does exist. Refer to the reference for restrictions using maps.

Map to primitive value

field my_food_scores type map<string, string> {
    indexing: summary
}
{
    "update": "id:mynamespace:food::example",
    "fields": {
        "my_food_scores{Strawberries}": {
            "assign": "Delicious!"
        }
    }
}

Map to struct

struct contact_info {
    field phone_number type string {}
    field email type string {}
}
field contacts type map<string, contact_info> {
    indexing: summary
}
{
    "update": "id:mynamespace:people::d_duck",
    "fields": {
        "contacts{\"Uncle Scrooge\"}": {
            "assign": {
                "phone_number": "555-123-4567",
                "email": "number_one_dime_luvr1877@example.com"
            }
        }
    }
}

Array field

Individual array elements may be updated using field path or match syntax

Array of primitive values

field ingredients type array<string> {
    indexing: summary
}
{
    "update": "id:mynamespace:cakes::tasty_chocolate_cake",
    "fields": {
        "ingredients[3]": {
            "assign": "2 cups of flour (editor's update: NOT asbestos!)"
        }
    }
}
Alternatively:
{
    "update": "id:mynamespace:cakes::tasty_chocolate_cake",
    "fields": {
        "ingredients": {
            "match": {
                "element": 3,
                "assign": "2 cups of flour (editor's update: NOT asbestos!)"
            }
        }
    }
}

Array of struct

Refer to the reference for restrictions using array of structs.

struct person {
    field first_name type string {}
    field last_name type string {}
}
field people type array<person> {
    indexing: summary
}
{
    "update": "id:mynamespace:students::example",
    "fields": {
        "people[34]": {
            "assign": {
                "first_name": "Bobby",
                "last_name": "Tables"
            }
        }
    }
}
Alternatively:
{
    "update": "id:mynamespace:students::example",
    "fields": {
        "people": {
            "match": {
                "element": 34,
                "assign": {
                     "first_name": "Bobby",
                     "last_name": "Tables"
                }
            }
        }
    }
}

Weighted set field

Adding new elements to a weighted set can be done using add, or by assigning with field{key} syntax. Example of the latter:

field int_weighted_set type weightedset<int> {
    indexing: summary
}
field string_weighted_set type weightedset<string> {
    indexing: summary
}
{
    "update":"id:weightedsetdoctype:weightedsetdoctype::example1",
    "fields": {
        "int_weighted_set{123}": {
            "assign": 123
        },
        "int_weighted_set{456}": {
            "assign": 100
        },
        "string_weighted_set{\"item 1\"}": {
            "assign": 144
        },
        "string_weighted_set{\"item 2\"}": {
            "assign": 7
        }
    }
}
Note that using the field{key} syntax for weighted sets may be less efficient than using add.

Clearing a field

To clear a field, assign a null value to it.

{
    "update": "id:mynamespace:music::example",
    "fields": {
        "title": {
            "assign": null
        }
    }
}

add

add is used to add entries to arrays, weighted sets or to the mapped dimensions of tensors.

Adding array elements

The added entries are appended to the end of the array in the order specified.

field tracks type array<string> {
    indexing: summary
}
{
    "update": "id:mynamespace:music::http://music.yahoo.com/bobdylan/BestOf",
    "fields": {
       "tracks": {
            "add": [
                "Lay Lady Lay",
                "Every Grain of Sand"
            ]
        }
    }
}

Add weighted set entries

Add weighted set elements by using a JSON key/value syntax, where the value is the weight of the element.

Adding a key/weight mapping that already exists will overwrite the existing weight with the new one.

field int_weighted_set type weightedset<int> {
    indexing: summary
}
field string_weighted_set type weightedset<string> {
    indexing: summary
}
{
    "update":"id:weightedsetdoctype:weightedsetdoctype::example1",
    "fields": {
        "int_weighted_set":  {
            "add": {
                "123": 123,
                "456": 100
            }
        },
        "string_weighted_set": {
            "add": {
                "item 1": 144,
                "item 2": 7
            }
        }
    }
}

Add tensor cells

Add cells to mapped or mixed tensors. Invalid for tensors with only indexed dimensions. Adding a cell that already exists will overwrite the cell value with the new value. The address must be fully specified, but cells with bound indexed dimensions not specified will receive the default value of 0.0.

field tensorfield type tensor(x{},y[3]) {
    indexing: attribute | summary
}
{
    "update": "id:mynamespace:tensordoctype::example",
    "fields": {
        "tensorfield": {
            "add": {
                "cells": [
                    { "address": { "x": "b", "y": "0" }, "value": 2.0 },
                    { "address": { "x": "b", "y": "1" }, "value": 3.0 }
                ]
            }
        }
    }
}

In this example, cell {"x":"b","y":"2"} will implicitly be set with value 0.0.

remove

Remove elements from weighted sets, maps and tensors with remove.

Weighted set field

field string_weighted_set type weightedset<string> {
    indexing: summary
}
{
    "update":"id:mynamespace:weightedsetdoctype::example1",
    "fields":  {
        "string_weighted_set": {
            "remove": {
                "item 2": 0
            }
        }
    }
}

Map field

field string_map type map<string, string> {
    indexing: summary
}
{
    "update":"id:mynamespace:mapdoctype::example1",
    "fields":  {
        "string_map{item 2}": {
            "remove": 0
        }
    }
}

Tensor field

Removes cells from mapped or mixed tensors. Invalid for tensors with only indexed dimensions. Only mapped dimensions should be specified for tensors with both mapped and indexed dimensions, as all indexed cells the mapped dimensions point to will be removed implicitly.

field tensorfield type tensor(x{},y[2]) {
    indexing: attribute | summary
}
{
    "update": "id:mynamespace:tensordoctype::example",
    "fields": {
        "tensorfield": {
            "remove": {
                "addresses": [
                    {"x": "b"},
                    {"x": "c"}
                ]
            }
        }
    }
}

In this example, the cells {x:b,y:0}, {x:b,y:1}, {x:c,y:0} and {x:c,y:1} will be removed.
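The effect of this remove can be modelled locally with a small Python sketch. It is only an illustration of the semantics described above, not Vespa's implementation; the tensor is modelled as a dict from mapped label to its list of dense values, and the names are hypothetical.

```python
def apply_tensor_remove(tensor: dict, addresses: list, mapped_dim: str = "x") -> dict:
    """Return a copy of the tensor without the blocks named in addresses.

    tensor maps each mapped label to its dense values, e.g. {"b": [1.0, 2.0]}.
    """
    labels_to_remove = {address[mapped_dim] for address in addresses}
    # Removing a mapped label removes all indexed cells it points to.
    return {
        label: values
        for label, values in tensor.items()
        if label not in labels_to_remove
    }


before = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]}
after = apply_tensor_remove(before, [{"x": "b"}, {"x": "c"}])
```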

Arithmetic

The four arithmetic operators increment, decrement, multiply and divide are used to modify single-value numeric fields without having to look up the current value before applying the update. Example:

field sales type int {
    indexing: summary | attribute
}
{
    "update": "id:music:music::http://music.yahoo.com/bobdylan/BestOf",
    "fields": {
        "sales": {
            "increment": 1
        }
    }
}
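Client code that generates such updates can be sketched in Python as below. The helper arithmetic_update is hypothetical; it only builds the JSON structure shown above and rejects operator names other than the four listed.

```python
ARITHMETIC_OPERATORS = {"increment", "decrement", "multiply", "divide"}


def arithmetic_update(document_id: str, field: str, operator: str, operand) -> dict:
    """Build an update operation applying one arithmetic operator to a field."""
    if operator not in ARITHMETIC_OPERATORS:
        raise ValueError(f"unsupported arithmetic operator: {operator}")
    return {"update": document_id, "fields": {field: {operator: operand}}}


update = arithmetic_update(
    "id:music:music::http://music.yahoo.com/bobdylan/BestOf", "sales", "increment", 1
)
```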

match

If an arithmetic operation is to be done for a specific key in a weighted set or array, use the match operation:

field track_popularity type weightedset<string> {
    indexing: summary | attribute
}
{
    "update": "id:music:music::http://music.yahoo.com/bobdylan/BestOf",
    "fields": {
        "track_popularity": {
            "match": {
                "element": "Lay Lady Lay",
                "increment": 1
            }
        }
    }
}
In other words, for the weighted set "track_popularity", match the element "Lay Lady Lay", then increment its weight by 1.

If the updated field were an array, the element value would be a positive integer.

Note: only one element can be matched per operation.

Modify tensors

Individual cells in tensors can be modified using the modify update. The cells are modified according to the given operation:

  • replace - replaces cell values
  • add - adds a value to the cell
  • multiply - multiplies the cell value by the given value

Cells must be fully specified. If the cell does not exist, the update for that cell will be ignored.

field tensorfield type tensor(x[3]) {
    indexing: attribute | summary
}
{
    "update": "id:mynamespace:tensordoctype::example",
    "fields": {
        "tensorfield": {
            "modify": {
                "operation": "replace",
                "addresses": [
                    { "address": { "x": "1" }, "value": 7.0 },
                    { "address": { "x": "2" }, "value": 8.0 }
                ]
            }
        }
    }
}
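The three modify operations can be modelled locally with a short Python sketch. This illustrates the semantics described above and is not Vespa's implementation; the tensor is modelled as a dict from the "x" label to the cell value, and the names are hypothetical.

```python
import operator

# How each modify operation combines the current cell value with the given value.
OPERATIONS = {
    "replace": lambda current, given: given,
    "add": operator.add,
    "multiply": operator.mul,
}


def apply_tensor_modify(tensor: dict, operation: str, cells: list) -> dict:
    """Return a copy of the tensor with the given cells modified."""
    combine = OPERATIONS[operation]
    result = dict(tensor)
    for cell in cells:
        label = cell["address"]["x"]
        # Updates to cells that do not exist are ignored.
        if label in result:
            result[label] = combine(result[label], cell["value"])
    return result


tensor = {"0": 1.0, "1": 2.0, "2": 3.0}
cells = [
    {"address": {"x": "1"}, "value": 7.0},
    {"address": {"x": "2"}, "value": 8.0},
    {"address": {"x": "5"}, "value": 9.0},  # no such cell, ignored
]
replaced = apply_tensor_modify(tensor, "replace", cells)
```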

create (create if nonexistent)

Updates to nonexistent documents are supported using the create field. An empty document is created on the content nodes before the update is applied. This simplifies client code in the case of multiple writers:

{
    "update": "id:mynamespace:music::http://music.yahoo.com/bobdylan/BestOf",
    "create": true,
    "fields": {
        "title": {
            "assign": "The best of Bob Dylan"
        }
    }
}
Java example using the Document API:
public DocumentUpdate createUpdate(DocumentType musicType) {
    DocumentUpdate update = new DocumentUpdate(musicType, "id:mynamespace:music::http://music.yahoo.com/bobdylan/BestOf");
    update.setCreateIfNonExistent(true);
    return update;
}
create may be used in combination with condition. If the document does not exist, the condition will be ignored and a new document with the update applied is automatically created. Otherwise, the condition must match for the update to take place.

Caution: if all existing replicas of a document are missing when an update with create: true is executed, a new document will always be created. This happens even if a condition has been given. If the existing replicas become available later, their version of the document will be overwritten by the newest update since it has a higher timestamp.
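The interaction between create and condition described above can be summarized as a small decision function. This Python sketch is a reading of the text in this section, with hypothetical names and outcome labels; it does not model replica availability.

```python
from typing import Optional


def update_outcome(document_exists: bool, create: bool,
                   condition_matches: Optional[bool] = None) -> str:
    """Describe what happens to an update.

    condition_matches is None when the update carries no condition.
    """
    if not document_exists:
        # With create, the condition is ignored and a new document is created.
        return "created and updated" if create else "rejected"
    if condition_matches is False:
        return "rejected"
    return "updated"


outcome = update_outcome(document_exists=False, create=True, condition_matches=False)
```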

fieldpath

Fieldpath is for accessing fields within composite structures - for structures that are not part of index or attribute, it is possible to access elements directly using fieldpaths. This is done by adding more information to the field value. For map structures, specify the key (see example).

mymap{mykey}
and then apply the operation to the element that is keyed by "mykey". Arrays can be accessed as well (see details).
myarray[3]
And this is also true for structs (see details). Note: Struct updates do not work for index mode:
mystruct.value1
This also works for nested structures, e.g. a map of map to array of struct:
{
    "update": "id:mynamespace:complexdoctype::foo",
    "fields": {
        "nested_structure{firstMapKey}{secondMapKey}[4].title": {
            "assign": "Look at me, mom! I'm hiding deep in a nested type!"
        }
    }
}
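Field paths like the one above can be assembled programmatically. The Python sketch below uses hypothetical helper names. Writing simple keys bare and wrapping other keys in double quotes is an assumption based on the examples in this document, and keys containing double quotes are rejected because their escaping is not covered here.

```python
import re

_BARE_KEY = re.compile(r"^[A-Za-z0-9_]+$")


def key(map_key) -> str:
    """Render a map key step, e.g. {mykey} or {"Uncle Scrooge"}."""
    text = str(map_key)
    if isinstance(map_key, int) or _BARE_KEY.match(text):
        return "{" + text + "}"
    if '"' in text:
        raise ValueError("keys containing double quotes are not covered by this sketch")
    return '{"' + text + '"}'


def index(position: int) -> str:
    """Render an array index step, e.g. [3]."""
    return f"[{int(position)}]"


def member(name: str) -> str:
    """Render a struct field step, e.g. .title."""
    return "." + name


def field_path(field: str, *steps: str) -> str:
    """Concatenate a field name with the rendered steps."""
    return field + "".join(steps)


path = field_path(
    "nested_structure", key("firstMapKey"), key("secondMapKey"), index(4), member("title")
)
```

The resulting string is used as the key inside "fields" in the update operation.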