Operations

A deployed Vespa application is a self-contained, highly available, distributed stateful system. Operating such systems at scale is difficult, so Vespa automates operations to the extent possible in the deployment environment it runs in.


Deployment environment | Automated operations | Suitable for
Vespa self-managed/open source | Application deployment (single application, single instance), application change (except rolling restarts), data redistribution, failover | Development
Vespa Kubernetes Operator | Application deployment (single application, single instance), application change, data redistribution, failover, node provisioning, failed node replacement, node type change, autoscaling, endpoint routing, encryption | Production in environments outside hyperscalers
Vespa Cloud | Application deployment (multiple applications, instances, regions, clouds), application change, data redistribution, failover, node provisioning, failed node replacement, node type change, autoscaling, endpoint routing, encryption, Vespa platform and OS upgrades, continuous deployment pipeline with verification, metrics and management console | Development, production on hyperscalers (including in customer accounts and VPCs)

Vespa is designed to let applications evolve in production. This includes the following aspects:

  • Application package changes are managed by Vespa's built-in control plane and carried out without impacting queries or writes. If a change cannot be made without impacting queries or writes, it is rejected on deployment (and requires a validation override to be allowed).
  • The operations supported by Vespa are those that can be scaled to hundreds of nodes, billions of documents and hundreds of thousands of queries per second. If you can run it on a single machine, you can scale it.
  • The hardware resources available in a cluster can be changed both up and down. Redistribution happens automatically in the background, with limited resource usage to avoid impacting queries and writes.
  • When possible (on Vespa Cloud), new revisions of applications are deployed in test zones where they can be verified by application-supplied functional tests before being allowed to progress to production.
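
The reject-unless-overridden rule for application package changes can be pictured with a small model. This is only a conceptual sketch, not Vespa's implementation: the function name `is_deployable`, the change identifier `"example-change"`, and the idea that an override carries an expiry date are all assumptions made for illustration.

```python
from datetime import date


def is_deployable(violations, overrides, today):
    """Decide whether a change may be deployed (hypothetical model).

    violations: set of identifiers for disruptive changes detected in the package.
    overrides:  dict mapping an identifier to the last date its override is valid.
    Returns (allowed, blocked), where blocked lists the identifiers that
    have no override or only an expired one.
    """
    blocked = sorted(
        v for v in violations
        if v not in overrides or overrides[v] < today
    )
    return (not blocked, blocked)


# A disruptive change without an override is rejected.
print(is_deployable({"example-change"}, {}, date(2024, 1, 1)))

# The same change is allowed while an explicit override is still valid.
print(is_deployable({"example-change"},
                    {"example-change": date(2024, 1, 15)},
                    date(2024, 1, 1)))
```

The point of the model is that a non-disruptive change passes without any ceremony, while a disruptive one needs a deliberate, explicit opt-in.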

Performance and scaling

Content clusters in Vespa can be scaled to any amount of content by adding more nodes (horizontal scaling). Data redistributes automatically, and there's no need to tune the process manually. To scale to high query volumes, content clusters can also be scaled by adding multiple groups of nodes (vertical scaling). Each group contains a single copy of the corpus, and container clusters automatically load balance over groups.
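
The two scaling dimensions can be illustrated with a back-of-the-envelope calculation. This is a rough sketch under simplifying assumptions: the function `plan_content_cluster` and its parameters are hypothetical, and it assumes a fixed document capacity per node and a fixed query capacity per group, which real sizing would replace with measurements.

```python
def ceil_div(a, b):
    """Integer ceiling division, avoiding floating point rounding."""
    return -(-a // b)


def plan_content_cluster(corpus_docs, docs_per_node, target_qps, qps_per_group):
    """Estimate the node layout for a content cluster (hypothetical model).

    Nodes within a group grow with the corpus size, since each group
    holds one full copy of the corpus. The number of groups grows with
    the query load, since queries are load balanced over groups.
    """
    nodes_per_group = ceil_div(corpus_docs, docs_per_node)
    groups = ceil_div(target_qps, qps_per_group)
    return {
        "nodes_per_group": nodes_per_group,
        "groups": groups,
        "total_nodes": nodes_per_group * groups,
    }


# 1 billion documents at 100 million per node, 5000 QPS at 2000 QPS per group.
plan = plan_content_cluster(1_000_000_000, 100_000_000, 5000, 2000)
print(plan)  # {'nodes_per_group': 10, 'groups': 3, 'total_nodes': 30}
```

In this model, doubling the corpus doubles the nodes in every group, while doubling the query load only adds groups.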

A Vespa application can consist of any number of stateless and stateful clusters. In larger applications it can be beneficial to split different functions into separate clusters that can be optimized independently. For example, you can use one stateless container cluster for feeding and another for querying, or use different content clusters for different data schemas.

Read more in elasticity and the performance guide.

