This guide covers the aspects of accessing documents in Vespa.
Documents are stored in content clusters.
Writes (PUT, UPDATE, DELETE) and reads (GET) pass through a container cluster.
Find a more detailed flow at the end of this article.
Vespa's indexing structures are built for high-rate, memory-only operations for field updates.
Refer to the feed sizing guide for write performance,
in particular partial updates for in-memory-only writes.
Vespa supports parent/child relationships for de-normalized data.
This can simplify the code that updates application data,
as a single write is reflected in all child documents.
Applications can add custom feed document processors
and multiple container clusters - see indexing for details.
Writes in Vespa are consistent, but Vespa will prioritize availability over consistency when
there is a conflict.
See the elasticity documentation
and the Vespa consistency model.
When possible, use the same client instance for all updates to a given document,
both for data consistency and for
performance (see concurrent mutations).
Read more on write operation ordering.
For performance, group field updates to the same document into
one update operation.
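As a minimal sketch of this grouping, the helper below merges several assign operations for one document into a single update body, using the JSON shape of a /document/v1/ update. The function name `build_update` and the field names are hypothetical and only serve the illustration.

```python
import json


def build_update(assignments):
    """Build one update body that assigns several fields at once.

    `assignments` maps a field name to its new value (hypothetical fields).
    """
    return {"fields": {name: {"assign": value} for name, value in assignments.items()}}


# One operation carrying two field updates, instead of two separate operations.
body = build_update({"title": "Best of", "price": 9})
print(json.dumps(body))
```

Sending one body like this means one operation per document rather than one per field.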
Applications can auto-expire documents.
This feature also blocks PUTs to documents that are already expired -
see indexing and
This is a common problem when feeding test data with timestamps,
and the writes are silently dropped.
Also see troubleshooting.
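To illustrate why such writes disappear, the sketch below mimics a retention rule that keeps only documents newer than a fixed window. The 24-hour window, the epoch-seconds timestamp, and the helper name `would_be_kept` are assumptions for illustration; the actual rule is whatever the application has configured.

```python
import time

RETENTION_SECONDS = 86400  # assumed retention window: 24 hours


def would_be_kept(doc_timestamp, now=None):
    """Return True if a timestamp (epoch seconds) falls inside the window."""
    now = time.time() if now is None else now
    return doc_timestamp > now - RETENTION_SECONDS


now = 1_700_000_000  # fixed "current time" so the example is reproducible
fresh = would_be_kept(now - 3600, now)       # one hour old
stale = would_be_kept(now - 7 * 86400, now)  # one week old
print(fresh, stale)
```

Running a check like this over test data before feeding can reveal which documents carry timestamps that are already outside the window.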
Get a document by ID.
Write a document by ID - a document is overwritten if a document with the same document ID exists.
Remove a document by ID.
If the document to be removed is not found, it is not considered a failure.
Read more about data-retention.
Also see batch deletes.
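The get, put and remove operations above all address a single document by ID. As a rough sketch, the helper below builds the /document/v1/ URL for one document. The endpoint `http://localhost:8080`, the namespace, the document type and the function name are assumptions, not values from this guide.

```python
from urllib.parse import quote


def document_url(endpoint, namespace, doctype, user_id):
    """Return the /document/v1/ URL for a single document."""
    return "{}/document/v1/{}/{}/docid/{}".format(
        endpoint.rstrip("/"),
        quote(namespace, safe=""),
        quote(doctype, safe=""),
        quote(user_id, safe=""),
    )


url = document_url("http://localhost:8080", "mynamespace", "music", "a-head-full-of-dreams")
# GET    url -> get the document
# POST   url -> put the document, with a JSON body like {"fields": {...}}
# DELETE url -> remove the document
print(url)
```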
An update is also referred to as a partial update,
as it updates some or all fields of a document by ID.
If the document to update is not found, it is not considered a failure.
Update supports create if nonexistent.
Updates can have conditions
for test-and-set use cases.
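A minimal sketch of how the create and condition options can be attached to a /document/v1/ update request, assuming they are passed as the `create` and `condition` query parameters. The helper name, the document URL and the `music.price==9` selection are hypothetical.

```python
from urllib.parse import urlencode


def update_request(doc_url, field_ops, create=False, condition=None):
    """Return (method, url, body) for an update, without sending it."""
    params = {}
    if create:
        params["create"] = "true"  # create the document if it does not exist
    if condition is not None:
        params["condition"] = condition  # test-and-set: apply only if this matches
    url = doc_url + ("?" + urlencode(params) if params else "")
    return "PUT", url, {"fields": field_ops}


method, url, body = update_request(
    "http://localhost:8080/document/v1/mynamespace/music/docid/1",
    {"price": {"assign": 10}},
    condition="music.price==9",
)
print(method, url)
```

The request is only constructed here. Any HTTP client can send it with the returned method, URL and JSON body.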
All data structures (attribute,
index and summary) are updatable.
Note that only assign and remove are idempotent,
so re-sending a message can apply other update operations more than once.
Use conditional writes for stronger consistency.
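The effect of re-sending can be seen in a small simulation. The code below is not Vespa code: it applies the same operation twice to a plain dictionary standing in for a document, and compares an assign with an increment. The `plays` field and the helper name are made up for the example.

```python
def apply_update(doc, field, op, value):
    """Apply one field operation to a dict standing in for a document."""
    if op == "assign":
        doc[field] = value
    elif op == "increment":
        doc[field] = doc.get(field, 0) + value
    else:
        raise ValueError("unsupported operation: " + op)
    return doc


assigned = {"plays": 5}
incremented = {"plays": 5}
for _ in range(2):  # simulate the same message being delivered twice
    apply_update(assigned, "plays", "assign", 6)
    apply_update(incremented, "plays", "increment", 1)

print(assigned["plays"], incremented["plays"])
```

The assigned field ends at the intended value of 6, while the incremented field ends at 7 instead of 6 because the operation was applied twice.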
API and utilities
Documents are created using JSON or in
|API / util||Description|
- Java library and command line client for feeding document operations using /document/v1/ over HTTP/2
- Asynchronous, high-performance Java implementation, with retries and dynamic throttling
- Simpler alternative to the Vespa HTTP client (below)
- Supports a JSON array of feed operations, as well as JSONL: one operation JSON per line
|Java Document API
||Provides direct read and write access to Vespa documents using Vespa's internal communication layer.
Use this when accessing documents from Java components in Vespa
such as searchers and
||Utility to feed data with high performance.
vespa-get gets single documents,
vespa-visit gets multiple.
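The JSONL feed format mentioned in the table holds one operation JSON per line. As a minimal sketch, the code below serializes a put, an update and a remove into that shape. The document IDs, field names and the helper `to_jsonl` are hypothetical.

```python
import json


def to_jsonl(operations):
    """Serialize feed operations as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(op) for op in operations) + "\n"


ops = [
    {"put": "id:mynamespace:music::1", "fields": {"title": "First"}},
    {"update": "id:mynamespace:music::1", "fields": {"title": {"assign": "Renamed"}}},
    {"remove": "id:mynamespace:music::2"},
]
feed = to_jsonl(ops)
print(feed, end="")
```

The resulting text can be written to a file and passed to a feed client that accepts JSONL.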
Use the vespa-feed-client
or /document/v1/ API directly to read and write documents:
Alternatively, use vespa-feeder to feed files
or the Java Document API.
Indexing and/or document processing
is a chain of processors that manipulate documents before they are stored.
Document processors can be user defined.
When using indexed search, the final step in the chain prepares documents for indexing.
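The chain idea can be sketched conceptually as follows. This is plain Python for illustration, not Vespa's document processor API: each processor is a function that takes a document and returns a modified copy, and the chain applies them in order before the document would be stored. All names and fields are hypothetical.

```python
def lowercase_title(doc):
    """Example processor: normalize the title field."""
    out = dict(doc)
    out["title"] = out["title"].lower()
    return out


def add_title_length(doc):
    """Example processor: derive a new field from an existing one."""
    out = dict(doc)
    out["title_length"] = len(out["title"])
    return out


def run_chain(doc, processors):
    """Pass the document through each processor in order."""
    for processor in processors:
        doc = processor(doc)
    return doc


processed = run_chain({"title": "Hello World"}, [lowercase_title, add_title_length])
print(processed)
```

Order matters in such a chain, since later processors see the output of earlier ones.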
The Document API forwards requests to distributors on content nodes.
For more information, read about content nodes
and the search core.