Comparison to Elasticsearch
In this document we will take a look at the main differences between Elasticsearch and Vespa.
The different use cases
Vespa focuses on big data serving: it is optimized for response times in the low milliseconds, high write and query load, machine learning integration, and automated high-availability operations. Vespa supports true real-time writes and true partial updates, and is easy to operate at large scale. Read more about Vespa’s features.
Vespa is the only open source platform optimized for such big data serving.
Analytics vs. Big Data Serving
To choose between Elasticsearch and Vespa for a use case, consider whether it needs to be optimized for analytics or for serving.
|Analytics||Big data serving|
|Response time in low seconds||Response time in low milliseconds|
|Low query rate||High query rate|
|Time series, append only||Random writes|
|Downtime, data loss acceptable||High availability, no data loss, online redistribution|
|Massive data sets (trillions of docs) are cheap||Massive data sets are more expensive|
|Analytics GUI integration||Machine learning integration|
The fundamental unit of scale in Elasticsearch is the shard. Sharding allows scale-out by partitioning the data into smaller chunks that can be distributed across a cluster of nodes. The challenge is choosing the right number of shards, because that decision is made only once per index. The choice affects performance, storage, and scale, since queries are sent to all shards. So how many shards is the right number?
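To see why a fixed shard count is hard to change later, consider a simplified model in which each document is routed to a shard by a stable hash of its ID modulo the shard count. This is an illustrative sketch, not Elasticsearch's exact routing function; the names `shard_for` and `moved_fraction` and the document IDs are hypothetical.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Route a document to a shard: stable hash of the ID, modulo the shard count."""
    digest = hashlib.sha256(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def moved_fraction(doc_ids, old_shards: int, new_shards: int) -> float:
    """Fraction of documents that land on a different shard after the count changes."""
    moved = sum(
        1 for d in doc_ids if shard_for(d, old_shards) != shard_for(d, new_shards)
    )
    return moved / len(doc_ids)


# Hypothetical document IDs, used only for illustration.
doc_ids = [f"doc-{i}" for i in range(10_000)]
fraction = moved_fraction(doc_ids, old_shards=5, new_shards=6)
print(f"{fraction:.0%} of documents change shard going from 5 to 6 shards")
```

Under this model, going from 5 to 6 shards relocates roughly five out of six documents, because a document stays put only when its hash gives the same remainder for both shard counts. That is why changing the shard count in a scheme like this amounts to rewriting the index rather than making a small adjustment.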
In Vespa you do not have to worry about the number of shards or about resharding; Vespa handles data distribution for you. You have a cluster of nodes, and you can add or remove nodes without resharding, so there is no resharding downtime.
Vespa allows applications to grow (and shrink) their hardware while serving queries and accepting writes as normal. Data is automatically redistributed in the background using the minimal amount of data movement required to reestablish an even data distribution. No restarts or other operations are needed: just change the hardware listed in the configuration and redeploy the application.
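The idea of minimal data movement can be illustrated with rendezvous (highest-random-weight) hashing. This is not Vespa's distribution algorithm; it is a well-known technique swapped in here because it has the same property: when a node is added, only the documents that the new node takes over are moved. The node names, document IDs, and helper functions below are hypothetical.

```python
import hashlib


def score(node: str, doc_id: str) -> int:
    """Deterministic pseudo-random weight for a (node, document) pair."""
    digest = hashlib.sha256(f"{node}/{doc_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(doc_id: str, nodes) -> str:
    """Rendezvous hashing: the node with the highest weight owns the document."""
    return max(nodes, key=lambda node: score(node, doc_id))


def moved_docs(doc_ids, old_nodes, new_nodes):
    """Documents whose owner differs between the two cluster layouts."""
    return [d for d in doc_ids if owner(d, old_nodes) != owner(d, new_nodes)]


# Hypothetical cluster growing from five nodes to six.
doc_ids = [f"doc-{i}" for i in range(10_000)]
old_nodes = [f"node-{i}" for i in range(5)]
new_nodes = old_nodes + ["node-5"]

moved = moved_docs(doc_ids, old_nodes, new_nodes)
print(f"{len(moved) / len(doc_ids):.0%} of documents move, all of them to node-5")
```

In this sketch about one sixth of the documents move, which is the new node's fair share, and none are shuffled between the existing nodes. Compare this with a hash-modulo scheme, where changing the partition count relocates most of the data.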
For a detailed guide on how to set up a multi-node Vespa system, see Multi-Node Quick Start.
Other relevant sources: