
Batch delete

Options for batch deleting documents:

  1. Find documents using a query, delete them, and repeat. Pseudocode:
    while True:
        query and read document ids; exit if none are returned
        delete the document ids using /document/v1
        wait a second  # optional: reduces load while deleting
    
  2. Like 1, but use the Vespa feed client. Instead of deleting documents one by one, stream remove operations to the API from a Java program, or append them to a file and feed the file with the vespa-feed-client binary:
    $ vespa-feed-client --file deletes.json --endpoint my-endpoint
  3. Use a document selection to expire documents. With garbage collection enabled, all documents that do not match the selection expression are deleted. Parent documents and imported fields can be used to expire a set of documents. The content nodes iterate over the corpus and delete the documents, which are later compacted out:
    <documents garbage-collection="true">
        <document type="mytype" selection="mytype.version > 4" />
    </documents>
  4. Use /document/v1 to delete all documents matching a document selection. This example drops all documents of the my_doctype schema:
    $ curl -X DELETE --cert data-plane-public-cert.pem --key data-plane-private-key.pem \
      "$ENDPOINT/document/v1/my_namespace/my_doctype/docid?selection=true&cluster=my_cluster"