Vespa API and interfaces
Deployment and configuration
Deploy API: Deploy application packages to configure a Vespa application
Config API: Get and Set configuration
Tenant API: Configure multiple tenants in the config servers
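As a rough illustration of deploying an application package over HTTP, the sketch below only builds the request and does not send it. It assumes a self-hosted config server on `localhost:19071`, the `default` tenant, and the `prepareandactivate` path; these values are not stated on this page and should be checked against the Deploy API reference. The helper name `build_deploy_request` is hypothetical.

```python
import urllib.request


def build_deploy_request(package_zip, config_server="http://localhost:19071",
                         tenant="default"):
    """Build (but do not send) a request that uploads a zipped application package.

    Assumption: the config server accepts the package at
    /application/v2/tenant/<tenant>/prepareandactivate as application/zip.
    """
    url = f"{config_server}/application/v2/tenant/{tenant}/prepareandactivate"
    return urllib.request.Request(
        url,
        data=package_zip,
        headers={"Content-Type": "application/zip"},
        method="POST",
    )


# Placeholder bytes stand in for the contents of a real application.zip.
request = build_deploy_request(b"PK")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the upload against a running config server.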
Document API
Reads and writes: APIs and binaries to read and update documents
/document/v1/: REST API for operations based on document ID (get, put, remove, update)
Feeding API: High performance feeding API, the recommended API for feeding data
JSON feed format: The Vespa Document format
Vespa Java Document API
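The four operations listed for /document/v1/ can be sketched as plain HTTP requests. The example below is a minimal sketch, not a client: it only constructs the method, URL and body. It assumes a container on `localhost:8080`, the path layout `/document/v1/<namespace>/<document-type>/docid/<id>`, and the method mapping shown in the code; the namespace `mynamespace`, document type `music` and field names are hypothetical. The second helper shows one put operation in the JSON feed format, assuming document IDs of the form `id:<namespace>:<document-type>::<id>`.

```python
import json
from urllib.parse import quote

# Assumed mapping of document operations to HTTP methods in /document/v1/.
HTTP_METHODS = {"get": "GET", "put": "POST", "update": "PUT", "remove": "DELETE"}


def document_v1_request(operation, namespace, doctype, doc_id, fields=None,
                        endpoint="http://localhost:8080"):
    """Return (method, url, body) for one operation on a document ID."""
    method = HTTP_METHODS[operation]
    parts = [quote(part, safe="") for part in (namespace, doctype, doc_id)]
    url = f"{endpoint}/document/v1/{parts[0]}/{parts[1]}/docid/{parts[2]}"
    body = None
    if operation == "put":
        body = json.dumps({"fields": fields or {}})
    elif operation == "update":
        # Partial update: assign a new value to each listed field.
        assigns = {name: {"assign": value} for name, value in (fields or {}).items()}
        body = json.dumps({"fields": assigns})
    return method, url, body


def feed_put(namespace, doctype, doc_id, fields):
    """Return one put operation in the JSON feed format."""
    return {"put": f"id:{namespace}:{doctype}::{doc_id}", "fields": fields}


print(document_v1_request("put", "mynamespace", "music", "1", {"artist": "Coldplay"}))
print(json.dumps(feed_put("mynamespace", "music", "1", {"artist": "Coldplay"})))
```

Keeping request construction separate from sending makes the mapping easy to inspect and test without a running Vespa instance.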
Query and grouping
Query API, Query API reference
Query Language, Query Language reference, Simple Query Language reference, Predicate fields
Vespa Query Profiles
Grouping API, Grouping API reference
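To make the query and grouping entries concrete, the sketch below builds two query URLs: a text match and a grouping request. It assumes a container on `localhost:8080`, the `/search/` endpoint, and the `yql` and `hits` request parameters; the schema name `music` and the fields `album` and `artist` are hypothetical, and the helper `build_query_url` is not part of any Vespa client.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(yql, hits=10, endpoint="http://localhost:8080", **params):
    """Return a GET URL for the query endpoint with URL-encoded parameters."""
    query = {"yql": yql, "hits": hits, **params}
    return f"{endpoint}/search/?{urlencode(query)}"


# Text matching on a hypothetical "album" field.
text_query = build_query_url('select * from music where album contains "head"', hits=5)

# Grouping: count documents per artist; hits=0 asks for group results only.
grouping_query = build_query_url(
    "select * from music where true | all(group(artist) each(output(count())))",
    hits=0,
)

# Decode the parameters again to show what the server would receive.
print(parse_qs(urlsplit(text_query).query))
print(parse_qs(urlsplit(grouping_query).query))
```

The same parameters can usually be sent as a JSON body in a POST request instead, which avoids URL-encoding long YQL strings.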
Processing
Vespa Processing: Request-Response processing
Vespa Document Processing: Feed processing
Request processing
Searcher API
Federation API
Web service API
Result processing
Custom renderer API
Status and state
Health and Metric APIs
/cluster/v2 API
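A health check can be sketched as a GET request followed by a look at the returned status code. The example below assumes the `/state/v1/health` path on a container at port 8080 and a JSON response shaped like `{"status": {"code": "up"}}`; both are assumptions to verify against the Health and Metric APIs reference. The sample response string is made up for illustration and no request is sent.

```python
import json


def health_url(host="localhost", port=8080):
    """Return the assumed health endpoint URL for one service."""
    return f"http://{host}:{port}/state/v1/health"


def is_up(response_body):
    """Return True if the health response reports status code 'up'."""
    payload = json.loads(response_body)
    return payload.get("status", {}).get("code") == "up"


# Illustrative response body, not captured from a real service.
sample_response = '{"status": {"code": "up"}}'
print(health_url(), is_up(sample_response))
```

A monitoring script might poll this URL for each service and alert when `is_up` returns False or the request fails.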