Document Operation API

This document explains the JSON REST API for put, update, remove, get and visit operations against a Vespa cluster. It is designed for ease of use rather than high throughput. Read the introduction. To enable the API, add the document-api element to services.xml.


Document Format

The structure is based on the document format, but the document ID is moved to the request URI instead of the document itself. The document fields must match the search definition. A request looks like:

[PUT/POST/DELETE/GET]: [host]:[port]/document/v1/{namespace}/{document-type}/docid/[{user-specified}]

Documents can be grouped:

[PUT/POST/DELETE/GET]: [host]:[port]/document/v1/{namespace}/{document-type}/[group|number]/{group name or number value}/[{user-specified}]

Example, inserting a document:

POST: [host]:[port]/document/v1/music/music/docid/123

{
    "fields": {
        "songs": "Knockin on Heaven's Door; Mr. Tambourine Man",
        "title": "Best of Bob Dylan",
        "url": ""
    }
}

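The URI patterns above can be assembled mechanically. The Python sketch below is one way to do it. The function name document_path and its arguments are illustrative assumptions, not part of Vespa. The trailing slash on group and number paths without a key simply mirrors the examples in this document.

```python
from urllib.parse import quote


def document_path(namespace, doc_type, user_specified=None, group=None, number=None):
    """Build a /document/v1 path for the docid, group, or number form.

    Hypothetical helper for illustration; not part of any Vespa client.
    """
    if group is not None and number is not None:
        raise ValueError("use either group or number, not both")
    parts = ["document", "v1", namespace, doc_type]
    if group is not None:
        parts += ["group", str(group)]
    elif number is not None:
        parts += ["number", str(number)]
    else:
        parts.append("docid")
    if user_specified is not None:
        parts.append(str(user_specified))
    # Percent-encode every segment so a "/" or "?" in a value cannot alter the path.
    path = "/" + "/".join(quote(part, safe="") for part in parts)
    # Mirror this document's examples: a group or number path without a key ends in "/".
    if user_specified is None and (group is not None or number is not None):
        path += "/"
    return path
```

Prefixing the result with [host]:[port] gives the full request URI.
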

Insert a new document, using HTTP POST. Optional parameters: condition and route.

$ curl -X POST -H "Content-Type:application/json" --data-binary @document-1.json http://hostname:8080/document/v1/music/music/docid/1


Update a document, using HTTP PUT. Optional parameters: condition, create, and route.

$ curl -X PUT -H "Content-Type:application/json" --data-binary @update.json http://hostname:8080/document/v1/music/music/docid/1


Delete a document, using HTTP DELETE. Optional parameters: condition and route.

$ curl -X DELETE http://hostname:8080/document/v1/music/music/docid/1
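
The three write operations differ only in HTTP method, body, and accepted optional parameters. As a rough Python equivalent of the curl commands above, the sketch below builds (but does not send) such requests with the standard library. The helper name build_write_request and the ALLOWED_PARAMS table are assumptions made for this example. The table only restates the optional parameters listed above.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Optional query parameters per HTTP method, as listed in the text above.
ALLOWED_PARAMS = {
    "POST": {"condition", "route"},
    "PUT": {"condition", "create", "route"},
    "DELETE": {"condition", "route"},
}


def build_write_request(method, url, body=None, **params):
    """Return an unsent urllib Request for an insert, update, or delete.

    Hypothetical helper for illustration; not part of any Vespa client.
    """
    unknown = set(params) - ALLOWED_PARAMS[method]
    if unknown:
        raise ValueError(f"{method} does not accept: {sorted(unknown)}")
    # Render booleans as lowercase true/false, the form the create parameter uses.
    query = {k: str(v).lower() if isinstance(v, bool) else v for k, v in params.items()}
    if query:
        url += "?" + urlencode(query)
    data, headers = None, {}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(url, data=data, headers=headers, method=method)
```

Passing the returned object to urllib.request.urlopen would send it. That step is left out so the sketch runs without a Vespa instance.
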


Get one or more documents, using HTTP GET:

$ curl http://hostname:8080/document/v1/music/music/docid/1

If docid is not specified, selection is supported and a set of documents is returned using visiting:

$ curl "http://hostname:8080/document/v1/music/music/docid?selection=music.genre=='blues'"
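
Selection expressions contain characters such as = and ' that are safer to percent-encode when a program builds the URL. A minimal Python sketch follows, assuming the hypothetical helper name visit_url and the placeholder host used in the examples above.

```python
from urllib.parse import urlencode


def visit_url(base, namespace, doc_type, selection):
    """Build a visit URL for one document type with an encoded selection.

    Hypothetical helper for illustration; not part of any Vespa client.
    """
    # urlencode percent-encodes "=" and "'" inside the selection value.
    query = urlencode({"selection": selection})
    return f"{base}/document/v1/{namespace}/{doc_type}/docid?{query}"


print(visit_url("http://hostname:8080", "music", "music", "music.genre=='blues'"))
```
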

Visit all documents for a group:

$ curl http://hostname:8080/document/v1/music/some-type-with-group/group/yellow/

A specific document in a group:

$ curl http://hostname:8080/document/v1/music/some-type-with-number/number/23/some_key

Visit documents across all document types and namespaces stored in content cluster mycluster:

$ curl http://hostname:8080/document/v1/?cluster=mycluster

A Document API request can only retrieve data from one cluster, so cluster must be specified for requests at the root /document/v1/ level. This is required even if you just have a single content cluster in your application.

Note: when visiting across all document types, some internal document fields set by Vespa may be returned as part of the results. To avoid this, limit visiting to just one document type using selection and explicitly filter these internal fields away using fieldSet:

$ curl --globoff 'http://hostname:8080/document/v1/?cluster=mycluster&selection=mydoctype&fieldSet=mydoctype:[document]'
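
The same root-level request can be built in Python, which sidesteps shell quoting of & and the brackets. This is a sketch only. The helper name cluster_visit_url is an assumption. Defaulting fieldSet to "<doctype>:[document]" follows the note above and is not a rule enforced by Vespa.

```python
from urllib.parse import urlencode


def cluster_visit_url(base, cluster, doc_type=None, field_set=None):
    """Build a root-level /document/v1/ visit URL for one content cluster.

    Hypothetical helper for illustration; not part of any Vespa client.
    """
    if not cluster:
        raise ValueError("cluster must be specified at the /document/v1/ level")
    params = {"cluster": cluster}
    if doc_type is not None:
        # Limit visiting to one document type and drop internal fields.
        params["selection"] = doc_type
        params["fieldSet"] = field_set or f"{doc_type}:[document]"
    return f"{base}/document/v1/?{urlencode(params)}"
```
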

To visit larger sets, use continuation.
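
Following continuation tokens amounts to a simple loop: repeat the same request with the latest token until no token comes back. The Python sketch below illustrates that loop. It assumes the JSON response carries the visited documents under a "documents" key and the token under a "continuation" key. Neither key name is stated in this document, so verify them against an actual response. The names visit_all and fetch_json are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    with urlopen(url) as response:
        return json.load(response)


def visit_all(base_url, params, fetch=fetch_json):
    """Yield visited documents until no continuation token is returned.

    Hypothetical helper for illustration; not part of any Vespa client.
    """
    params = dict(params)  # copy so the caller's dict is not modified
    while True:
        page = fetch(base_url + "?" + urlencode(params))
        # ASSUMPTION: "documents" and "continuation" are the response keys.
        yield from page.get("documents", [])
        token = page.get("continuation")
        if not token:
            break
        # Keep every other parameter equal; only the token changes.
        params["continuation"] = token
```

The fetch argument is there so the loop can be exercised with a stub instead of a live cluster.
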


Request Parameters

condition: Requires that this condition is true; otherwise a 40x response is returned. See conditional updates for details.
create: true | false. If set to true, an update creates the document if it does not already exist. See updates for more details on updates to non-existing documents.
selection: Selects a set of documents using visiting. Details are in the document selector language.
continuation: When visiting, a continuation token is returned if the result set is large. Pass the token in the continuation parameter of an otherwise identical request to get the next set of documents.
route: The route for document operations. Default value is 'default'. See routes.
wantedDocumentCount: Positive integer. Best-effort attempt to not respond to the client before wantedDocumentCount documents can be returned. The response may still contain fewer documents if there are not enough matching documents left to visit in the cluster, or if the visiting times out. This parameter is intended for clusters with relatively few documents, where each GET operation would otherwise return only a handful of documents. The maximum value is bounded by an implementation-specific limit to prevent excessive resource usage. If the cluster holds many documents (on the order of tens of millions), there is no need to set this value.
fieldSet: A field set string constraining the set of document fields returned from the backend. Default value is <visited document type>:[document], which returns all fields.
concurrency: Positive integer. Sends the given number of visitors in parallel to the backend, improving throughput at the cost of resource usage. Caution: with a concurrency of N, the worst-case memory used while processing the request grows linearly with N. This is because the container currently buffers all response data in memory before sending it to the client, and all sent visitors must complete before the response can be sent. Default is 1.
cluster: String. Name of the content cluster to GET from.
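
Since wantedDocumentCount and concurrency must be positive integers, a client may want to reject bad values before sending a request. The Python sketch below shows one way to do that. The helper name visit_query is an assumption, and the check covers only what is stated above. The implementation-specific upper limit on wantedDocumentCount is not known here and is not checked.

```python
from urllib.parse import urlencode

# Parameters described above as positive integers.
POSITIVE_INT_PARAMS = {"wantedDocumentCount", "concurrency"}


def visit_query(**params):
    """Validate visit parameters and return an encoded query string.

    Hypothetical helper for illustration; not part of any Vespa client.
    """
    for name in POSITIVE_INT_PARAMS & params.keys():
        value = params[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    return urlencode(params)
```
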