Comparison to Elasticsearch

This document looks at the main differences between Elasticsearch and Vespa.

The different use cases

With its focus on big data serving, Vespa is optimized for low-millisecond response times, high write and query load, machine learning integration, and automated high-availability operations. Vespa supports true real-time writes and true partial updates, and is easy to operate at large scale. You can read more in the overview of Vespa’s features.

Vespa is the only open source platform optimized for such big data serving.

Analytics vs. Big Data Serving

To decide whether Elasticsearch or Vespa is the right choice for a use case, consider whether it needs to be optimized for analytics or for serving.

Analytics                                       | Big data serving
Response time in low seconds                    | Response time in low milliseconds
Low query rate                                  | High query rate
Time series, append only                        | Random writes
Downtime and data loss acceptable               | High availability, no data loss, online redistribution
Massive data sets (trillions of docs) are cheap | Massive data sets are more expensive
Analytics GUI integration                       | Machine learning integration

Scaling

The fundamental unit of scale in Elasticsearch is the shard. Sharding allows scale-out by partitioning the data into smaller chunks that can be distributed across a cluster of nodes. The challenge is to figure out the right number of shards, because you only get to make the decision once per index. That decision affects performance, storage, and scale, since queries are sent to all shards. So what is the right number of shards?
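To illustrate why a fixed shard count is hard to change later, here is a minimal sketch that assumes documents are routed by a hash of the document ID modulo the number of shards. The function names and the use of MD5 are illustrative assumptions, not Elasticsearch's actual implementation; the point is only that changing the modulus sends most documents to a different shard.

```python
import hashlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document ID to a shard using a stable hash and a modulo.

    MD5 is only a stand-in here; the real hash function is an assumption.
    """
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def fraction_moved(doc_ids, old_shards: int, new_shards: int) -> float:
    """Fraction of documents whose shard changes when the shard count changes."""
    moved = sum(
        1 for d in doc_ids
        if shard_for(d, old_shards) != shard_for(d, new_shards)
    )
    return moved / len(doc_ids)


doc_ids = [f"doc-{i}" for i in range(10_000)]
moved = fraction_moved(doc_ids, old_shards=5, new_shards=6)
print(f"{moved:.0%} of documents change shard when going from 5 to 6 shards")
```

With modulo routing, a document keeps its shard only when its hash gives the same remainder under both shard counts, so most of the index would have to be rewritten. This is why the shard count is in practice a one-time decision per index.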

In Vespa you do not have to worry about the number of shards or about resharding: Vespa takes care of data distribution for you. You have a cluster of nodes, and you can add or remove nodes without resharding, which means no downtime for resharding.

Vespa allows applications to grow (and shrink) their hardware while serving queries and accepting writes as normal. Data is automatically redistributed in the background using the minimal amount of data movement required to reestablish an even data distribution. No restarts or other operations are needed, just change the hardware listed in the configuration and redeploy the application.
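The "minimal amount of data movement" property can be sketched with rendezvous (highest-random-weight) hashing. This is a stand-in technique chosen for illustration, not a description of Vespa's actual distribution algorithm; the bucket and node names are hypothetical. Each bucket of documents is owned by the node with the highest hash score, so adding a node only moves the buckets that the new node wins.

```python
import hashlib


def score(bucket: str, node: str) -> int:
    """Deterministic pseudo-random score for a (bucket, node) pair."""
    digest = hashlib.md5(f"{bucket}:{node}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner(bucket: str, nodes) -> str:
    """Rendezvous hashing: the node with the highest score owns the bucket."""
    return max(nodes, key=lambda n: score(bucket, n))


# Hypothetical cluster: four nodes, then a fifth one is added.
buckets = [f"bucket-{i}" for i in range(10_000)]
old_nodes = ["node0", "node1", "node2", "node3"]
new_nodes = old_nodes + ["node4"]

moved_buckets = [
    b for b in buckets if owner(b, old_nodes) != owner(b, new_nodes)
]
moved_fraction = len(moved_buckets) / len(buckets)
print(f"{moved_fraction:.0%} of buckets move when a fifth node is added")
```

In this sketch roughly one fifth of the buckets move, and every moved bucket lands on the new node; buckets never shuffle between the existing nodes. That is the behavior needed to keep serving queries and accepting writes while the cluster grows.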

For a detailed guide on how to set up a multi-node Vespa system, see the Multi-Node Quick Start.

Other relevant sources: