/document/v1 API guide

This is the /document/v1 API guide. Refer to the document/v1 API reference.

Request examples

GET
Get a document by ID:
$ curl http://hostname:8080/document/v1/my_namespace/my_document-type/docid/1
Get a document in a group:
$ curl http://hostname:8080/document/v1/namespace/music/number/23/some_key
$ curl http://hostname:8080/document/v1/namespace/music/group/groupname/some_key
Visit all documents:
$ curl http://hostname:8080/document/v1/namespace/music/docid
Visit all documents using continuation:
$ curl "http://hostname:8080/document/v1/namespace/music/docid?continuation=AAAAEAAAAAAAAAM3AAAAAAAAAzYAAAAAAAEAAAAAAAFAAAAAAABswAAAAAAAAAAA"
Visit using a selection:
$ curl "http://hostname:8080/document/v1/namespace/music/docid?selection=music.genre=='blues'"

Note that a visit with a selection is a linear scan over all the music documents in the example above, because the selection expression is evaluated for every document. Splitting a selection visit into multiple HTTP requests with different selection strings therefore does not improve throughput. Use the concurrency parameter to increase parallelism and throughput instead.
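A selection expression contains characters such as = and ' that are safer to percent-encode before they go into a URL. The following is a minimal sketch of how that could be done in plain POSIX shell. The urlencode helper is hypothetical (it is not part of Vespa or curl), it assumes ASCII input, and the concurrency value of 4 is only an illustration:

```shell
# Hypothetical helper: percent-encode every character that is not an
# unreserved URL character. Assumes ASCII input.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

selection="music.genre=='blues'"
url="http://hostname:8080/document/v1/namespace/music/docid?selection=$(urlencode "$selection")&concurrency=4"
echo "$url"
# The request itself would then be:
# curl "$url"
```

Alternatively, curl can do the encoding itself with -G together with --data-urlencode "selection=music.genre=='blues'".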

Visit all documents for a group:

$ curl http://hostname:8080/document/v1/namespace/music/number/23/
$ curl http://hostname:8080/document/v1/namespace/music/group/groupname/
Visit documents across all non-global document types and namespaces stored in content cluster mycluster:
$ curl "http://hostname:8080/document/v1/?cluster=mycluster"
Visit documents across all global document types and namespaces stored in content cluster mycluster:
$ curl "http://hostname:8080/document/v1/?cluster=mycluster&bucketSpace=global"
POST
Post data in the document JSON format:
$ curl -X POST -H "Content-Type:application/json" --data-binary @document-1.json http://hostname:8080/document/v1/namespace/music/docid/1
{
    "fields": {
        "songs": "Knockin on Heaven's Door; Mr. Tambourine Man",
        "title": "Best of Bob Dylan",
        "url": "http://music.yahoo.com/bobdylan/BestOf"
    }
}
PUT
Do a partial update of a document:
$ curl -X PUT -H "Content-Type:application/json" --data-binary @update.json http://hostname:8080/document/v1/namespace/music/docid/1
{
    "fields": {
        "title": {
            "assign": "New title"
        }
    }
}
DELETE
Delete document with ID 1:
$ curl -X DELETE http://hostname:8080/document/v1/namespace/music/docid/1
Delete all documents in my_doctype schema:
$ curl -X DELETE --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
  "$ENDPOINT/document/v1/my_namespace/my_doctype/docid?selection=true&cluster=my_cluster"

ID examples

  • Uniform distribution: id:mynamespace:music::mydocid-123
  • Data access is grouped, e.g. personal data (each user has a numeric user id): id:mynamespace:music:n=12345:mydocid-123
  • Using a string identifier to group data: id:mynamespace:music:g=mymusicsite.com:mydocid-123
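The ID forms above correspond to the docid, number and group path variants used in the request examples. As a rough sketch of that correspondence, the hypothetical helper below (docid_to_path is not part of any Vespa tool) turns a document ID into a /document/v1 path. It assumes the local ID needs no URL encoding:

```shell
# Hypothetical helper: map id:<namespace>:<doctype>:<key/value>:<local id>
# to the matching /document/v1 path.
docid_to_path() (
  rest=${1#id:}
  namespace=${rest%%:*}; rest=${rest#*:}
  doctype=${rest%%:*};   rest=${rest#*:}
  keyvalue=${rest%%:*}   # empty, n=<number> or g=<group>
  localid=${rest#*:}     # may itself contain colons
  case $keyvalue in
    n=*) kind="number/${keyvalue#n=}" ;;
    g=*) kind="group/${keyvalue#g=}" ;;
    '')  kind="docid" ;;
    *)   echo "unsupported key/value: $keyvalue" >&2; exit 1 ;;
  esac
  printf '/document/v1/%s/%s/%s/%s\n' "$namespace" "$doctype" "$kind" "$localid"
)

docid_to_path "id:mynamespace:music::mydocid-123"
docid_to_path "id:mynamespace:music:n=12345:mydocid-123"
docid_to_path "id:mynamespace:music:g=mymusicsite.com:mydocid-123"
```

The three calls print the docid, number and group paths respectively.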

Conditional writes

A test-and-set condition can be added to Put, Remove and Update operations. Example:

{
    "update": "id:mynamespace:music::a-head-full-of-dreams",
    "condition": "music.artist==\"Elvis\"",
    "fields": {
        "artist": {
            "assign": "Coldplay"
        }
    }
}

If the condition is not met, a 412 Precondition Failed is returned:

$ curl -X PUT -H "Content-Type:application/json" \
  --data-binary @src/test/resources/A-Head-Full-of-Dreams-update.json \
  http://localhost:8080/document/v1/mynamespace/music/docid/a-head-full-of-dreams

{
    "pathId": "/document/v1/mynamespace/music/docid/a-head-full-of-dreams",
    "id": "id:mynamespace:music::a-head-full-of-dreams",
    "message": "[UNKNOWN(251013) @ tcp/vespa-container:19112/default]:
      ReturnCode(TEST_AND_SET_CONDITION_FAILED,
      Condition did not match document nodeIndex=0 bucket=20000000000000de)"
}

Also see the condition reference.
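In a script, a failed condition is usually easier to detect from the HTTP status code than from the response body. The sketch below shows one possible way to handle it. The explain_status helper and its messages are hypothetical, and the commented curl call assumes the update file and URL from the example above:

```shell
# Hypothetical helper: turn the HTTP status of a conditional write
# into a short message.
explain_status() {
  case $1 in
    200) echo "write applied" ;;
    412) echo "condition not met, write skipped" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Capture only the status code of the request (-w '%{http_code}'), e.g.:
# status=$(curl -s -o /dev/null -w '%{http_code}' -X PUT \
#   -H "Content-Type:application/json" \
#   --data-binary @src/test/resources/A-Head-Full-of-Dreams-update.json \
#   http://localhost:8080/document/v1/mynamespace/music/docid/a-head-full-of-dreams)
status=412
explain_status "$status"
```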

Create if nonexistent

Updates to nonexistent documents are supported using the create request parameter. An empty document is created on the content nodes before the update is applied. This simplifies client code when there are multiple writers. Example:

$ cat src/test/resources/A-Head-Full-of-Dreams-update.json
{
    "fields": { "artist": { "assign": "Coldplay" } }
}

$ curl -X PUT -H "Content-Type:application/json" \
  --data-binary @src/test/resources/A-Head-Full-of-Dreams-update.json \
  'http://localhost:8080/document/v1/mynamespace/music/docid/a-head-full-of-things?create=true'

create can be used in combination with a condition. If the document does not exist, the condition will be ignored and a new document with the update applied is automatically created. Otherwise, the condition must match for the update to take place.

Data dump

To iterate over documents, use visiting. Sample output:

{
    "pathId": "/document/v1/namespace/doc/docid",
    "documents": [
        {
            "id": "id:namespace:doc::id-1",
            "fields": {
                "title": "Document title 1",
                ...
            }
        },
        ...
    ],
    "continuation": "AAAAEAAAAAAAAAM3AAAAAAAAAzYAAAAAAAEAAAAAAAFAAAAAAABswAAAAAAAAAAA"
}
Note the continuation token: use it in the next request to fetch more data. The sample script below dumps all data, using jq for JSON parsing:

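#!/bin/bash

# Trace each command as it runs
set -x

ENDPOINT="https://endpoint.vespa.oath.cloud"
NAMESPACE=open
DOCTYPE=doc
CLUSTER=documentation
CERT=data-plane-public-cert.pem
KEY=data-plane-private-key.pem

continuation=""
idx=0

# The loop condition is the whole command list before "do"; its exit status
# is that of the final pipeline, i.e. jq. "jq -e" exits non-zero when the
# response has no continuation token, which ends the loop after the last chunk.
while
  ((idx+=1))
  echo "$continuation"
  # Zero-padded chunk number for the output file name
  printf -v out "%05g" "$idx"
  filename="${NAMESPACE}-${DOCTYPE}-${out}.data.gz"
  echo "Fetching data..."
  # Save the gzipped response to a file and extract the continuation token
  token=$(curl -s --cert "${CERT}" --key "${KEY}" \
          "${ENDPOINT}/document/v1/${NAMESPACE}/${DOCTYPE}/docid?wantedDocumentCount=20&concurrency=4&cluster=${CLUSTER}&${continuation}" \
          | tee >(gzip > "${filename}") | jq -re .continuation)
do
  continuation="continuation=${token}"
done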
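The request URL in the script above differs between the first call (no token) and the following calls (token appended). A small, hypothetical helper can make that explicit. The visit_url function below is only a sketch and is not part of Vespa. It reuses the wantedDocumentCount and concurrency values from the script and assumes the token can be placed in the URL as-is:

```shell
# Hypothetical helper: build the visit URL for one chunk.
# The continuation parameter is added only when a token is given.
visit_url() (
  endpoint=$1
  namespace=$2
  doctype=$3
  cluster=$4
  token=$5
  url="$endpoint/document/v1/$namespace/$doctype/docid?cluster=$cluster&wantedDocumentCount=20&concurrency=4"
  if [ -n "$token" ]; then
    url="$url&continuation=$token"
  fi
  printf '%s\n' "$url"
)

# First request: no token yet
visit_url "https://endpoint.vespa.oath.cloud" open doc documentation ""
# Follow-up request: token taken from the previous response
visit_url "https://endpoint.vespa.oath.cloud" open doc documentation "AAAAEAAAAAAAAAM3"
```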

Using fieldsets

When visiting across all document types, some internal document fields set by Vespa (e.g. geo fields) may be returned as part of the results. To avoid this, limit visiting to a single document type using selection, and explicitly filter out the internal fields using fieldSet:

$ curl "http://hostname:8080/document/v1/?cluster=mycluster&selection=mydoctype&fieldSet=mydoctype:%5Bdocument%5D"