/document/v1/ API reference

This is the reference documentation for the /document/v1/ API. The API is designed for ease of use rather than high throughput. For background, read the introduction to writing to Vespa.

To enable it, add document-api to the container in services.xml:

<services>
  <container>
    <document-api/>
  </container>
</services>

Document Format

Documents and operations use the document format. The document fields must match the search definition. A request looks like:

[PUT/POST/DELETE/GET]: [host]:[port]/document/v1/{namespace}/{document-type}/docid/[{user-specified}]
Documents can be grouped:
[PUT/POST/DELETE/GET]: [host]:[port]/document/v1/{namespace}/{document-type}/[group|number]/{group name or number value}/[{user-specified}]
Example, inserting a document:
POST: [host]:[port]/document/v1/music/music/docid/123

{
    "fields": {
        "songs": "Knockin on Heaven's Door; Mr. Tambourine Man",
        "title": "Best of Bob Dylan",
        "url": "http://music.yahoo.com/bobdylan/BestOf"
    }
}
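The URL forms and the insert example above can be sketched in Python. This is a minimal illustration using only the standard library, not an official client: the helper names document_url and put_request are hypothetical, and hostname:8080 is the placeholder host used elsewhere on this page. The trailing slash for group and number visits simply mirrors the examples on this page.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def document_url(base, namespace, doctype, user_specified=None,
                 group=None, number=None):
    """Build a /document/v1/ URL in the docid, group, or number form."""
    if group is not None and number is not None:
        raise ValueError("use either group or number, not both")
    if group is not None:
        middle = ["group", str(group)]
    elif number is not None:
        middle = ["number", str(number)]
    else:
        middle = ["docid"]
    segments = [namespace, doctype] + middle
    # Percent-encode each segment so characters such as '/' or ' ' are safe.
    path = "/".join(quote(s, safe="") for s in segments)
    if user_specified is not None:
        path += "/" + quote(user_specified, safe="")
    elif middle[0] != "docid":
        path += "/"  # mirror the trailing slash used when visiting a group
    return base.rstrip("/") + "/document/v1/" + path


def put_request(url, fields):
    """Wrap fields in the document JSON format and prepare an HTTP POST."""
    body = json.dumps({"fields": fields}).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


# Prepare (but do not send) the insert from the example above.
req = put_request(
    document_url("http://hostname:8080", "music", "music", "123"),
    {"title": "Best of Bob Dylan"},
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send the request; the sketch stops short of that so it can be run without a Vespa instance.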

Operations

Put Insert a new document, using HTTP POST. Optional parameters: condition and route.
$ curl -X POST -H "Content-Type:application/json" --data-binary @document-1.json http://hostname:8080/document/v1/music/music/docid/1
Update Update a document, using HTTP PUT. Optional parameters: condition, create, and route.
$ curl -X PUT -H "Content-Type:application/json" --data-binary @update.json http://hostname:8080/document/v1/music/music/docid/1
Delete Delete a document, using HTTP DELETE. Optional parameters: condition and route.
$ curl -X DELETE http://hostname:8080/document/v1/music/music/docid/1
Get Get one or more documents, using HTTP GET:
$ curl http://hostname:8080/document/v1/music/music/docid/1
If the document ID is omitted, a set of documents is returned using visiting, optionally restricted by selection. Quote the URL so the shell does not strip the single quotes in the selection:
$ curl "http://hostname:8080/document/v1/music/music/docid?selection=music.genre=='blues'"
Visit all documents for a group:
$ curl http://hostname:8080/document/v1/music/some-type-with-group/group/yellow/
A specific document in a group:
$ curl http://hostname:8080/document/v1/music/some-type-with-number/number/23/some_key
Visit documents across all document types and namespaces stored in content cluster mycluster:
$ curl "http://hostname:8080/document/v1/?cluster=mycluster"

A Document API request can only retrieve data from one cluster, so cluster must be specified for requests at the root /document/v1/ level. This is required even if you just have a single content cluster in your application.

Note: when visiting across all document types, some internal document fields set by Vespa may be returned as part of the results. To avoid this, limit visiting to just one document type using selection and explicitly filter out these internal fields using fieldSet. The URL is quoted so the shell does not treat & as a command separator, and the brackets in [document] are percent-encoded so curl does not interpret them as a URL range:

$ curl "http://hostname:8080/document/v1/?cluster=mycluster&selection=mydoctype&fieldSet=mydoctype:%5Bdocument%5D"
To visit larger sets, use continuation.
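The operations in this section differ only in HTTP method, path, query parameters, and body, which the following Python sketch makes explicit. It is an illustration under assumptions rather than an official client: build_request is a hypothetical helper, hostname:8080 is the placeholder host from the examples, and the update body using "assign" is an assumed example of the document JSON update format, since the contents of update.json are not shown on this page. Letting urlencode build the query string avoids the shell and curl quoting pitfalls of characters such as ', & and [.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder host, as used in the curl examples on this page.
BASE = "http://hostname:8080/document/v1"


def build_request(method, path, params=None, body=None):
    """Prepare (but do not send) a /document/v1/ request."""
    url = BASE + path
    if params:
        # urlencode percent-encodes quotes, '=', ':' and '[' in values.
        url += "?" + urlencode(params)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data else {}
    return Request(url, data=data, headers=headers, method=method)


# Put: HTTP POST with the full document.
put = build_request("POST", "/music/music/docid/1",
                    body={"fields": {"title": "Best of Bob Dylan"}})

# Update: HTTP PUT; create=true creates the document if it is missing.
# The "assign" body is an assumed example of an update.
update = build_request("PUT", "/music/music/docid/1",
                       params={"create": "true"},
                       body={"fields": {"title": {"assign": "Best of Bob Dylan"}}})

# Delete: HTTP DELETE, no body.
delete = build_request("DELETE", "/music/music/docid/1")

# Get (visit) at the root level: cluster is required there.
visit = build_request("GET", "/",
                      params={"cluster": "mycluster",
                              "selection": "mydoctype",
                              "fieldSet": "mydoctype:[document]"})

for request in (put, update, delete, visit):
    print(request.get_method(), request.full_url)
```

Each prepared request can be sent with urllib.request.urlopen; the sketch only constructs them so it runs without a Vespa instance.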

Parameters

condition Requires that this condition is true; otherwise a 4xx response is returned. See conditional updates for details.
create true | false. If set to true, an update creates the document if it does not already exist. See updates for more details on updates to non-existing documents.
selection Select a set of documents using visiting - details in document selector language.
continuation When visiting, a continuation token is returned if the result set is large. To get the next set of documents, repeat the request unchanged except for adding the token as the continuation parameter.
route The route for document operations. Default value is 'default'. See routes.
wantedDocumentCount Positive integer. Best-effort attempt to not respond to the client before wantedDocumentCount documents can be returned. The response may still contain fewer documents if there are not enough matching documents left to visit in the cluster, or if the visiting times out. This parameter is intended for the case when you have relatively few documents in your cluster and where each GET operation otherwise would only return a handful of documents.

Note that the maximum value of wantedDocumentCount is bounded by an implementation-specific limit to prevent excessive resource usage. If the cluster has many documents (on the order of tens of millions), there is no need to set this value.

fieldSet A field set string constraining the set of document fields returned from the backend. Default value is <visited document type>:[document], which returns all fields.
concurrency Positive integer. Sends the given number of visitors in parallel to the backend, improving throughput at the cost of resource usage. Caution: given a concurrency parameter of N, the worst case for memory used while processing the request grows linearly with N. This is because the container currently buffers all response data in memory before sending them to the client, and all sent visitors must complete before the response can be sent. Default is 1.
cluster String. Name of the content cluster to get documents from. Required for requests at the root /document/v1/ level.
bucketSpace String - explicitly specifies the bucket space to visit. Document types marked as global exist in a separate bucket space from non-global document types. When visiting a particular document type the correct bucket space is automatically deduced based on the provided type name. When visiting at a root /document/v1/ level this information is not available, so only the non-global ("default") bucket space is visited by default. Specify global here to visit global documents instead. Supported values: default (for non-global documents) and global.
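The continuation parameter described above lends itself to a simple loop: repeat the same request, adding the token from the previous response, until no token is returned. The Python sketch below illustrates this under stated assumptions. It assumes the visit response is a JSON object with a "documents" list and an optional "continuation" string; the names visit_all, http_get_json and fake_fetch are hypothetical, and fake_fetch is a stand-in for a real server so the example runs without one.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get_json(url):
    """Fetch a URL and decode the JSON response body."""
    with urlopen(url) as response:
        return json.load(response)


def visit_all(base_url, params, fetch=http_get_json):
    """Yield every visited document, following continuation tokens."""
    query = dict(params)
    while True:
        page = fetch(base_url + "?" + urlencode(query))
        # Assumed response shape: {"documents": [...], "continuation": "..."}
        yield from page.get("documents", [])
        token = page.get("continuation")
        if not token:
            break  # no token: the visit is complete
        # Keep all other parameters equal and add the token.
        query["continuation"] = token


def fake_fetch(url):
    """Stand-in for a server: two pages, linked by a made-up token."""
    if "continuation=" in url:
        return {"documents": [{"id": "id:music:music::2"}]}
    return {"documents": [{"id": "id:music:music::1"}],
            "continuation": "AAAA"}


docs = list(visit_all(
    "http://hostname:8080/document/v1/",
    {"cluster": "mycluster", "bucketSpace": "default",
     "wantedDocumentCount": 100},
    fetch=fake_fetch,
))
print([d["id"] for d in docs])
```

Dropping the fetch argument makes visit_all use http_get_json against a real endpoint. Because the function is a generator, documents are handed to the caller one page at a time instead of being collected in memory first.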