This is the /document/v1 API reference documentation. Use this API for synchronous Document operations to a Vespa endpoint - refer to reads and writes for other options.
The document/v1 API guide has examples and use cases.
Some examples use number and group document id modifiers. These are special cases that only work as expected for document types with mode=streaming or mode=store-only. Do not use group or number modifiers with regular indexed mode document types.
To enable the API, add document-api to the serving container cluster in services.xml:
<services>
  <container>
    <document-api/>
  </container>
</services>
HTTP request | document/v1 operation | Description |
---|---|---|
GET |
Get a document by ID or Visit a set of documents by selection. |
|
Get |
Get a document:
/document/v1/<namespace>/<document-type>/docid/<document-id>
/document/v1/<namespace>/<document-type>/number/<numeric-group-id>/<document-id>
/document/v1/<namespace>/<document-type>/group/<text-group-id>/<document-id>
Optional parameters: |
|
Visit |
Iterate over and get all documents, or a selection of documents, in chunks, using continuation tokens to track progress. Visits are a linear scan over the documents in the cluster.
/document/v1/
It is possible to specify namespace and document type with the visit path:
/document/v1/<namespace>/<document-type>/docid
Documents can be grouped to limit accesses to a subset. A group is defined by a numeric ID or string; see id scheme.
/document/v1/<namespace>/<document-type>/group/<group>
/document/v1/<namespace>/<document-type>/number/<number>
Mandatory parameters:
|
|
POST |
Put a given document, by ID, or Copy a set of documents by selection from one content cluster to another. | |
Put |
Write the document contained in the request body in JSON format.
/document/v1/<namespace>/<document-type>/docid/<document-id>
/document/v1/<namespace>/<document-type>/group/<group>
/document/v1/<namespace>/<document-type>/number/<number>
Optional parameters:
|
|
Copy |
Write documents visited in the source cluster to the destinationCluster in the same application. A selection is mandatory, typically the document type. Supported paths (see visit above for semantics):
/document/v1/
/document/v1/<namespace>/<document-type>/docid/
/document/v1/<namespace>/<document-type>/group/<group>
/document/v1/<namespace>/<document-type>/number/<number>
Mandatory parameters: Optional parameters: |
|
PUT |
Update a document with the given partial update, by ID, or Update where the given selection is true. |
|
Update |
Update a document with the partial update contained in the request body in the document JSON format.
/document/v1/<namespace>/<document-type>/docid/<document-id>
Optional parameters:
|
|
Update where |
Update visited documents in cluster with the partial update contained in the request body in the document JSON format. Supported paths (see visit above for semantics):
/document/v1/<namespace>/<document-type>/docid/
/document/v1/<namespace>/<document-type>/group/<group>
/document/v1/<namespace>/<document-type>/number/<number>
Mandatory parameters: Optional parameters:
|
|
DELETE |
Remove a document, by ID, or Remove where the given selection is true. |
|
Remove |
Remove a document.
/document/v1/<namespace>/<document-type>/docid/<document-id>
Optional parameters: |
|
Delete where |
Delete visited documents from cluster. Supported paths (see visit above for semantics):
/document/v1/
/document/v1/<namespace>/<document-type>/docid/
/document/v1/<namespace>/<document-type>/group/<group>
/document/v1/<namespace>/<document-type>/number/<number>
Mandatory parameters: Optional parameters:
|
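
The single-document path forms listed in the table above can be assembled programmatically. The following is a minimal sketch, not part of the API itself: the helper name `document_path` is hypothetical, and percent-encoding each path segment is an assumption made here so that IDs containing reserved characters stay inside one segment.

```python
from urllib.parse import quote


def document_path(namespace, doctype, doc_id, group=None, number=None):
    """Build a /document/v1 path for one document (hypothetical helper)."""
    if group is not None and number is not None:
        raise ValueError("use either group or number, not both")
    # Assumption: every segment is percent-encoded, including '/'.
    base = f"/document/v1/{quote(namespace, safe='')}/{quote(doctype, safe='')}"
    if group is not None:
        modifier = f"group/{quote(str(group), safe='')}"
    elif number is not None:
        modifier = f"number/{int(number)}"
    else:
        modifier = "docid"
    return f"{base}/{modifier}/{quote(str(doc_id), safe='')}"
```

As noted at the top of this page, the group and number forms are only meaningful for document types with mode=streaming or mode=store-only.
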
Parameter | Type | Description | ||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
bucketSpace | String |
Specify the bucket space to visit.
Document types marked as | ||||||||||
cluster | String |
Name of content cluster to GET from, or visit. |
||||||||||
concurrency | Integer |
Sends the given number of visitors in parallel to the backend,
improving throughput at the cost of resource usage.
Default is 1.
When
Important:
Given a concurrency parameter of N,
the worst case for memory used while processing the request grows linearly with N,
unless stream mode is turned on.
This is because the container currently buffers all response data in memory before sending them to the client,
and all sent visitors must complete before the response can be sent.
|
||||||||||
condition | String |
For test-and-set. Run a document operation conditionally — if the condition fails, a 412 Precondition Failed is returned. See example. |
||||||||||
continuation | String |
When visiting, a continuation token is returned as the |
||||||||||
create | Boolean |
If |
||||||||||
destinationCluster | String |
Name of content cluster to copy to, during a copy visit. |
||||||||||
dryRun | Boolean |
Used by the vespa-feed-client
using |
||||||||||
fieldSet | String |
A field set string
with the set of document fields to fetch from the backend.
Default is the special |
||||||||||
route | String |
The route for single document operations, and for operations generated
by copy, update or
deletion visits. Default value is |
||||||||||
selection | String |
Select only a subset of documents when visiting — details in document selector language. |
||||||||||
sliceId | Integer |
The slice number of the visit represented by this HTTP request. This number must be non-negative
and less than the number of slices specified for the visit -
e.g., if the number of slices is 10,
Note:
If the number of distribution bits changes during a sliced visit,
the results are undefined.
Thankfully, this is a very rare occurrence and is only triggered when adding content nodes.
|
||||||||||
slices | Integer |
Split the document corpus into this number of independent slices. This lets multiple, concurrent series of HTTP requests advance the same logical visit independently, by specifying a different sliceId for each. |
||||||||||
stream | Boolean |
Whether to stream the HTTP response, allowing data to flow as soon as documents arrive from the backend. This obsoletes the wantedDocumentCount parameter. The HTTP status code will always be 200 if the visit is successfully initiated. Default value is false. |
||||||||||
format.tensors | String |
Controls how tensors are rendered in the result.
|
||||||||||
timeChunk | String |
Target time to spend on one chunk of a copy, update or remove visit; with optional ks, s, ms or µs unit. Default value is 60. |
||||||||||
timeout | String | Request timeout in seconds, or with optional ks, s, ms or µs unit. Default value is 180s. |
||||||||||
tracelevel | Integer |
Number in the range [0,9], where higher gives more details. The trace dumps which nodes and chains the document operation has touched. See routes. |
||||||||||
wantedDocumentCount | Integer |
Best effort attempt to not respond to the client before
The maximum value of |
||||||||||
fromTimestamp | Integer |
Filters the returned document set to only include documents that were last modified at a time point equal to or higher than the specified value, in microseconds from UTC epoch. Default value is 0 (include all documents). |
||||||||||
toTimestamp | Integer |
Filters the returned document set to only include documents that were last modified
at a time point lower than the specified value, in microseconds from UTC epoch.
Default value is 0 (sentinel value; include all documents). If non-zero, must be
greater than, or equal to, |
||||||||||
includeRemoves | Boolean |
Include recently removed document IDs, along with the set of returned documents.
By default, only documents currently present in the corpus are returned in the
|
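
As an illustration of how the visit parameters above combine, the sketch below builds one query string per slice of a logical visit, using the slices, sliceId, selection and fromTimestamp parameters from the table. The helper names `to_micros` and `sliced_visit_queries` are hypothetical. The only facts taken from this reference are the parameter names, the rule that sliceId must be non-negative and less than slices, and that timestamps are microseconds from the UTC epoch.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_micros(moment):
    """Convert a timezone-aware datetime to microseconds since the UTC epoch."""
    delta = moment - EPOCH
    return (delta.days * 86_400 + delta.seconds) * 1_000_000 + delta.microseconds


def sliced_visit_queries(slices, selection=None, modified_since=None):
    """Return one query string per slice, with sliceId in [0, slices)."""
    if slices < 1:
        raise ValueError("slices must be at least 1")
    queries = []
    for slice_id in range(slices):
        params = {"slices": slices, "sliceId": slice_id}
        if selection is not None:
            params["selection"] = selection
        if modified_since is not None:
            params["fromTimestamp"] = to_micros(modified_since)
        queries.append(urlencode(params))
    return queries
```

Each query string would be appended to a visit path and advanced independently with its own continuation token.
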
POST and PUT requests for single document operations must include a body, as must PUT requests for update visits. In the body, a field has a value for a POST and an update operation object for a PUT. Documents and operations use the document JSON format. The document fields must match the schema:
Values for id / put / update in the request body are silently dropped.
The ID is generated from the request path, regardless of request body data - example:
This makes it easier to generate a feed file that can be used for both the vespa-feed-client and this API.
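
To make the point about the request path concrete, here is a minimal sketch that prepares a Put request from a feed-file style operation without stripping its put key. The endpoint `http://localhost:8080`, the document ID, the field names and the helper `build_put_request` are all illustrative assumptions. The request is only constructed, not sent.

```python
import json
from urllib.request import Request


def build_put_request(endpoint, path, operation):
    """Prepare (but do not send) a POST carrying a document in JSON format."""
    body = json.dumps(operation).encode("utf-8")
    return Request(
        endpoint + path,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Feed-file style operation; per the text above, the "put" value is dropped
# by the API and the ID is taken from the request path instead.
operation = {
    "put": "id:mynamespace:music::doc1",   # hypothetical ID
    "fields": {"title": "Example title"},  # hypothetical field
}
request = build_put_request(
    "http://localhost:8080",               # assumed endpoint
    "/document/v1/mynamespace/music/docid/doc1",
    operation,
)
```

Passing `request` to `urllib.request.urlopen` would submit it to a running endpoint.
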
Non-exhaustive list of status codes:
Code | Description |
---|---|
200 | OK. Attempts to remove or update a non-existent document also yield this status code (see 412 below). |
400 | Bad request. Returned for undefined document types and other request errors.
See 13465
for defined document types not assigned to a content cluster when using PUT.
Inspect message for details. |
404 | Not found; the document was not found. This is only used when getting documents. |
412 | Precondition failed; the condition is not met.
Inspect message for details. This is also the result when
a condition is specified, but the document does not exist. |
429 | Too many requests; the document API has too many inflight feed operations, retry later. |
500 | Server error; an unspecified error occurred when processing the request/response. |
503 | Service unavailable; the document API was unable to produce a response at this time. |
504 | Gateway timeout; the document API failed to respond within the given (or default 180s) timeout. |
507 | Insufficient storage; the content cluster is out of memory or disk space. |
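
A client typically branches on these codes. The sketch below is one possible mapping from the status codes in the table above to coarse outcomes. The function name `classify_status` and the outcome labels are hypothetical, and the fallback buckets for unlisted 4xx and 5xx codes are an assumption, since the list above is non-exhaustive.

```python
def classify_status(code):
    """Map a /document/v1 HTTP status code to a coarse outcome label."""
    listed = {
        200: "ok",
        404: "not_found",            # only used when getting documents
        412: "condition_not_met",    # test-and-set condition failed
        429: "retry_later",          # too many inflight feed operations
        507: "insufficient_storage",
    }
    if code in listed:
        return listed[code]
    # Assumption: group any other codes by class.
    if 400 <= code < 500:
        return "client_error"
    if 500 <= code < 600:
        return "server_error"
    return "unexpected"
```

For any non-200 outcome, the message field of the response body carries the details.
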
Header | Values | Description |
---|---|---|
X-Vespa-Ignored-Fields | true |
Will be present and set to 'true' only when a put or update contains one or more fields which were ignored since they are not present in the document type. Such operations will be applied exactly as if they did not contain the field operations referencing non-existing fields. References to non-existing fields in field paths are not detected. |
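
A client can check for this header after a put or update. The following is a small sketch; the helper name `fields_were_ignored` is hypothetical. Comparing the header name case-insensitively is a general HTTP convention, not something this reference states.

```python
def fields_were_ignored(headers):
    """Return True if the response headers flag ignored fields.

    `headers` is any mapping of header names to values, for example a dict
    built from the response headers.
    """
    for name, value in headers.items():
        if name.lower() == "x-vespa-ignored-fields":
            return value.strip().lower() == "true"
    return False
```
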
Responses are in JSON format, with the following fields:
Field | Description |
---|---|
pathId | Request URL path — always included. |
message | An error message — included for all failed requests. |
id | Document ID — always included for single document operations, including Get. |
fields | The requested document fields — included for successful Get operations. |
documents[] | Array of documents in a visit result — each document has the id and fields. |
documentCount | Number of visited and selected documents.
If includeRemoves is true, this also includes
the number of returned removes (tombstones). |
continuation | Token to be used to get the next chunk of the corpus - see continuation. |
A GET response can include a fields object if the document was found.
A GET visit result can include an array of documents plus a continuation:
A continuation indicates that the client should make further requests to get more data. The lack of a continuation indicates either that an error occurred and visiting should cease, or that there are no more documents.
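
The continuation handling described above can be sketched as a loop. This is a minimal illustration under stated assumptions: `visit_all` and `fetch_chunk` are hypothetical names, and `fetch_chunk` stands in for whatever performs the HTTP GET and returns the parsed JSON response for a given continuation token (or `None` for the first request). Treating a message field as an error follows the response field table above.

```python
def visit_all(fetch_chunk):
    """Yield every document of a visit, following continuation tokens."""
    continuation = None
    while True:
        chunk = fetch_chunk(continuation)
        if "message" in chunk:
            # A message is included for failed requests: stop visiting.
            raise RuntimeError(chunk["message"])
        yield from chunk.get("documents", [])
        continuation = chunk.get("continuation")
        if continuation is None:
            # No continuation and no error: there are no more documents.
            return
```

In a real client, `fetch_chunk` would pass the token back as the continuation request parameter on the same visit path.
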
A message can be returned for failed operations: