FAQ - frequently asked questions

Refer to Vespa Support for more support options.


Does Vespa support a flexible ranking score?

Ranking is maybe the best Vespa feature - we like to think of it as scalable, online computation. A rank profile is where the application’s logic is implemented, supporting simple types like double and complex types like tensor. Supply ranking data in queries in query features (e.g. different weights per customer), or look up in a Searcher. Typically a document (e.g. product) “feature vector”/”weights” will be compared to a user-specific vector (tensor).

Where would customer specific weightings be stored?

Vespa doesn’t have specific support for storing customer data as such. You can store this data as a separate document type in Vespa and look it up before passing the query, or store this customer meta-data as part of the other meta-data for the customer (i.e. login information) and pass it along the query when you send it to the backend. Find an example on how to look up data in album-recommendation-docproc. </p>

How to create a tensor on the fly in the ranking expression?

Create a tensor in the ranking function from arrays or weighted sets using tensorFrom... functions - see document features.


What limits apply to json document size?

There is no hard limit. Vespa requires a document to be able to load into memory in serialized form. Vespa is not optimized for huge documents.

Can a document have lists (key value pairs)?

E.g. a product is offered in a list of stores with a quantity per store. Use multivalue fields (array of struct) or parent child. Which one to chose depends on use case, see discussion the the latter link.

Does a whole document need to be updated and re-indexed?

E.g, price and quantity available per store may change often vs the actual product attributes. Vespa supports partial updates of documents. Also, the parent/child feature is implemented to support use-cases where child elements are updated frequently, while a more limited set of parent elements are updated less frequently.

What ACID guarantees if any does Vespa provide for single writes / updates / deletes vs batch operations etc?

See Vespa Consistency Model. Vespa is not transactional in the traditional sense, it doesn’t have strict ACID guarantees. Vespa is designed for high performance use-cases with eventual consistency as an acceptable (and to some extent configurable) trade-off.


Is hierarchical facets supported?

Facets is called grouping in Vespa. Groups can be multi-level.

Is filters supported?

Add filters to the query using YQL using boolean, numeric and text matching.

How to query for similar items?

One way is to describe items using tensors and query for the nearest neighbor - using full precision or approximate (ANN) - the latter is used when the set is too large for an exact calculation. Apply filters to the query to limit the neighbor candidate set. Using dot products or weak and are alternatives.

Stop-word support?

Vespa does not have a stop-word concept inherently. See the sample app for how to use filter terms.

Does Vespa support addition of flexible NLP processing for documents and search queries?

E.g. integrating NER, word sense disambiguation, specific intent detection. Vespa supports these things well:

Does Vespa support customization of the inverted index?

E.g. instead of using terms or n-grams as the unit, we might use terms with specific word senses (e.g. bark (dog bark) vs. bark (tree bark), or BCG (company) vs. BCG (vaccine name). Creating a new index format means changing the core. However, for the examples above, one just need control over the tokens which are indexed (and queried). That is easily done in some Java code. The simplest way to do this is to plug in a custom tokenizer. That gets called from the query parser and bundled linguistics processing Searchers as well as the Document Processor creating the annotations that are consumed by the indexing operation. Since all that is Searchers and Docprocs which you can replace and/or add custom components before and after, you can also take full control over these things without modifying the platform itself.

Programming Vespa

is Python plugins supported / is there a scripting language?

Plugins have to run in the JVM - jython might be an alternative, however Vespa Team has no experience with it. Vespa does not have a language like painless - it is more flexible to write application logic in a JVM-supported language, using Searchers and Document Processors.


Vespa has a near real-time indexing core with typically sub-second latencies from ingest to indexed. This depends on the use-case, available resources and how the system is tuned. Some more examples and thoughts can be found in the scaling guide.

Is there a batch ingestion mode, what limits apply?

Vespa does not have a concept of “batch ingestion” as it contradicts many of the core features that are the strengths of Vespa, including serving elasticity and sub-second indexing latency. That said, we have numerous use-cases in production that do high throughput updates to large parts of the (sometimes entire) document set. In cases where feed throughput is more important than indexing latency, you can tune this to meet your requirements. Some of this is detailed in the feed sizing guide.

Can the index support up to 512GB index size in memory?

Yes. The content node is implemented in C++ and not memory constrained other than what the operating system does.


How fast can nodes be added and removed from a running cluster?

Elasticity is a core Vespa strength - easily add and remove nodes with minimal (if any) serving impact. The exact time needed depends on how much data will need to be migrated in the background for the system to converge to ideal data distribution.

Should Vespa API search calls be load balanced or does Vespa do this automatically?

You will need to load balance incoming requests between the nodes running the stateless Java container cluster(s). This can typically be done using a simple network load balancer available in most cloud services. This is included when using Vespa Cloud, with an already load balanced HTTPS endpoint - both locally within the region and globally across regions.

Supporting index partitions

Search sizing is the intro to this. Topology matters, and this is much used in the high-volume Vespa applications to optimise latency vs. cost

Can a running cluster be upgraded with zero downtime?

With Vespa Cloud, we do automated background upgrades daily without noticeable serving impact. If you host Vespa yourself, you can do this, but need to implement the orchestration logic necessary to handle this. The high level procedure is found in live-upgrade.

Can Vespa be deployed multi-region?

Vespa Cloud has integrated support - query a global endpoint. Writes will have to go to each zone. There is no auto-sync between zones.

Can Vespa serve an Offline index?

Building indexes offline requires the partition layout to be known in the offline system, which is in conflict with elasticity and auto-recovery (where nodes can come and go without service impact). It is also at odds with realtime writes. For these reasons, it is not recommended, and not supported.