These steps walk through deploying Vespa using the official Helm chart.
We recommend the following tools for a smooth deployment.
These instructions assume that your kubeconfig points to an active Kubernetes cluster. To create one, refer to the Getting Started guide. For instructions on deploying Vespa locally, refer to the Deploy Vespa Locally guide.
Vespa on Kubernetes uses a Custom Resource Definition (CRD) called a VespaSet. Users who intend to manage
the CRD definition themselves should apply it to the cluster before installation.
The RBAC permissions that are needed to run Vespa are listed on the Permissions page. The Helm Chart will automatically apply a default set of RBAC resources onto the cluster.
Note: Vespa on Kubernetes is an enterprise feature. You will need access to the images below to successfully deploy Vespa. Contact us through our support portal to receive an authentication ID and token. For production use, we recommend mirroring these images into your own registry or a well-known internal repository appropriate for your infrastructure.
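One common way to mirror the images is the crane CLI. The sketch below assumes crane is installed and authenticated to both registries; registry.example.com is a placeholder for your internal registry, and the tag shown is only an example version.

```shell
# Mirror the Vespa image into an internal registry (sketch).
# Assumes the crane CLI is installed and logged in to both registries.
MY_REGISTRY=registry.example.com   # placeholder: your internal registry

# Repeat for the operator image as needed.
crane copy "images.ves.pa/kubernetes/vespa:8.643.16" \
  "${MY_REGISTRY}/kubernetes/vespa:8.643.16" ||
  echo "mirror failed: is crane installed and authenticated?"
```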
VESPA_IMAGE=images.ves.pa/kubernetes/vespa
VESPA_OPERATOR_IMAGE=images.ves.pa/kubernetes/operator
HELM_CHART_REF=oci://images.ves.pa/helm/vespa-operator
We will use this naming convention throughout this guide. The tags for these references conform to the Vespa Version release semantics.
We recommend using the latest Vespa release as the default. We will refer to it as VESPA_VERSION.
The Vespa Operator and all Vespa components are local to a namespace. We will refer to the namespace as NAMESPACE.
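Under these conventions, the environment for the rest of this guide can be set up as follows. The version shown is only an example release, and the namespace name is a placeholder; substitute your own values.

```shell
# Image and chart references from this guide
export VESPA_IMAGE=images.ves.pa/kubernetes/vespa
export VESPA_OPERATOR_IMAGE=images.ves.pa/kubernetes/operator
export HELM_CHART_REF=oci://images.ves.pa/helm/vespa-operator

# Deployment-specific values (examples; adjust to your environment)
export VESPA_VERSION=8.643.16   # example: use the latest Vespa release
export NAMESPACE=vespa          # placeholder namespace name
```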
Authenticate to the Helm Chart OCI registry. The credentials will be provided by our support team.
$ helm registry login images.ves.pa -u $USER -p $TOKEN
Now, install the Helm Chart into the target namespace. This will deploy the Vespa Operator and apply
the VespaSet CRD. Set image.repository to the VESPA_OPERATOR_IMAGE provided by our support team,
and image.tag to the VESPA_VERSION to deploy.
$ helm install vespa-operator $HELM_CHART_REF --namespace $NAMESPACE --create-namespace --set image.repository=$VESPA_OPERATOR_IMAGE --set image.tag=$VESPA_VERSION
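Equivalently, the --set flags can be kept in a values file. This is a sketch: only the image.repository and image.tag keys from the command above are assumed, and the values shown are examples.

```yaml
# values.yaml -- equivalent to the --set flags above (sketch)
image:
  repository: images.ves.pa/kubernetes/operator  # VESPA_OPERATOR_IMAGE
  tag: "8.643.16"                                # VESPA_VERSION
```

$ helm install vespa-operator $HELM_CHART_REF --namespace $NAMESPACE --create-namespace -f values.yaml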
The CRD definition lifecycle can be managed separately. However, the CRD specification must be manually applied to the Kubernetes cluster before installing the Helm Chart. Our support team can provide this specification if necessary.
Use the --skip-crds option to skip CRD definition installation.
$ kubectl apply -f vespasets.k8s.ai.vespa-v1.yaml
$ helm install vespa-operator $HELM_CHART_REF --namespace $NAMESPACE --create-namespace --skip-crds --set image.repository=$VESPA_OPERATOR_IMAGE --set image.tag=$VESPA_VERSION
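When managing the CRD yourself, you can confirm it is registered before installing the chart with --skip-crds. The CRD name below is inferred from the specification filename above and may differ in your version.

```shell
# Check that the VespaSet CRD is registered before installing with --skip-crds.
# The CRD name is inferred from the spec filename and may differ.
if kubectl get crd vespasets.k8s.ai.vespa >/dev/null 2>&1; then
  crd_state=present
else
  crd_state=missing
fi
echo "VespaSet CRD: $crd_state"
```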
Verify that the Deployment and the Vespa Operator Pod were created successfully using the following check.
$ kubectl wait --for=condition=available deployment/vespa-operator --timeout=120s -n $NAMESPACE \
    && kubectl get pods -l app=vespa-operator -o wide -n $NAMESPACE
To deploy a VespaSet suitable for dev environment clusters, refer to the example in the
Setup Dev Environment documentation.
A VespaSet represents a quorum of ConfigServers that manage Vespa applications. Several examples of
VespaSet specifications are provided in the Helm Chart samples directory.
A sample VespaSet for Amazon Elastic Kubernetes Service (EKS) is shown below.
# vespaset sample for EKS
$ cat > vespaset.yaml <<EOF
apiVersion: k8s.ai.vespa/v1
kind: VespaSet
metadata:
  name: vespaset-sample
  namespace: ${NAMESPACE}
spec:
  version: "${VESPA_VERSION}"
  configServer:
    image: "${VESPA_OPERATOR_IMAGE}"
    storageClass: "gp3"
    generateRbac: false
  application:
    image: "${VESPA_IMAGE}"
    storageClass: "gp3"
  ingress:
    endpointType: "LOAD_BALANCER"
EOF
$ kubectl apply -f vespaset.yaml
An example for a local deployment on minikube would be as follows.
# vespaset sample for minikube
$ cat > vespaset.yaml <<EOF
apiVersion: k8s.ai.vespa/v1
kind: VespaSet
metadata:
  name: vespaset-sample
  namespace: ${NAMESPACE}
spec:
  version: "${VESPA_VERSION}"
  configServer:
    image: "${VESPA_OPERATOR_IMAGE}"
    storageClass: "local-storage"
    generateRbac: false
  application:
    image: "${VESPA_IMAGE}"
    storageClass: "local-storage"
  ingress:
    endpointType: "NONE"
EOF
$ kubectl apply -f vespaset.yaml
Once the VespaSet is applied, the operator will automatically detect the newly created VespaSet resource and create a quorum of
ConfigServers.
The ConfigServers will then bootstrap themselves and establish a quorum. This process takes roughly a minute. The bootstrap process is complete once
the VespaSet reports the RUNNING phase for all ConfigServer Pods. For example:
$ kubectl describe vespaset vespaset-sample -n $NAMESPACE
Name:         vespaset-sample
Namespace:    $NAMESPACE
API Version:  k8s.ai.vespa/v1
Kind:         VespaSet
Spec:
  Application:
    Image:          192.168.49.2:5000/localhost/vespaai/kubernetes
    Storage Class:  gp3
  Config Server:
    Generate Rbac:  false
    Image:          192.168.49.2:5000/localhost/vespaai/kubernetes
    Storage Class:  gp3
  Ingress:
    Endpoint Type:  NONE
  Version:  8.643.16
Status:
  Bootstrap Status:
    Pods:
      cfg-0:
        Last Updated:       2026-01-29T21:38:45Z
        Message:            Pod is running
        Phase:              RUNNING
        Converged Version:  8.643.16
      cfg-1:
        Last Updated:       2026-01-29T21:38:09Z
        Message:            Pod is running
        Phase:              RUNNING
        Converged Version:  8.643.16
      cfg-2:
        Last Updated:       2026-01-29T21:36:32Z
        Message:            Pod is running
        Phase:              RUNNING
        Converged Version:  8.643.16
  Last Transition Time:  2026-01-29T21:33:55Z
  Message:               All configservers running
  Phase:                 RUNNING
Events:  <none>
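The aggregate phase can also be read directly with a JSONPath expression. The .status.phase field path below is an assumption based on the describe output above; adjust it to your CRD schema.

```shell
# Read the aggregate VespaSet phase; the field path is assumed from the
# describe output and may differ in your CRD version.
phase=$(kubectl get vespaset vespaset-sample -n "$NAMESPACE" \
  -o jsonpath='{.status.phase}' 2>/dev/null) || phase="UNKNOWN"
echo "VespaSet phase: ${phase:-UNKNOWN}"
```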
A Vespa application can be deployed through the ConfigServers once bootstrap has completed. Refer to the Vespa Sample Applications to get started. In the following example, we will use the Album Recommendation sample.
Set up the Vespa CLI, then download the Album Recommendation sample into a directory.
$ vespa clone album-recommendation myapp && cd myapp
Modify the application package with resource specifications to ensure the correct Pod count, as shown below:
<?xml version="1.0" encoding="utf-8" ?>
<services version="1.0" xmlns:deploy="vespa" xmlns:preprocess="properties">
  <container id="default" version="1.0">
    <document-api/>
    <search/>
    <nodes count="2">
      <resources vcpu="2" memory="2Gb" disk="20Gb" />
    </nodes>
  </container>
  <content id="music" version="1.0">
    <min-redundancy>2</min-redundancy>
    <documents>
      <document type="music" mode="index" />
    </documents>
    <nodes count="2">
      <resources vcpu="2" memory="2Gb" disk="20Gb" />
    </nodes>
  </content>
</services>
Enable port-forwarding from the ConfigServer's ingress port 19071 to your local port 19071.
$ vespa config set target local
$ kubectl -n $NAMESPACE port-forward pod/cfg-0 19071:19071
Deploy and activate the application.
$ vespa prepare --target local
$ vespa activate --target local
The ConfigServers will create the Container, Content, and Cluster-Controller Pods as specified in the application package. The deployment
is considered complete once all Pods show the phase RUNNING in the VespaSet status.
Port-forwarding provides a simple way to access the ingress ports locally. For other ingress options, see the Configuring the External Access Layer section.
Feed documents to the Dataplane entrypoint by port-forwarding the Dataplane ingress port and the ConfigServer ingress port.
# Ensure the port-forward to 19071 is still active
$ kubectl -n $NAMESPACE port-forward pod/cfg-0 19071:19071
# Port-forward to the dataplane ingress port
$ kubectl -n $NAMESPACE port-forward pod/default-100 8080:8080
Then, use the Vespa CLI to feed a document from the same Album Recommendation sample:
$ vespa feed dataset/A-Head-Full-of-Dreams.json
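Once feeding succeeds, a query against the same port-forwarded endpoint confirms the documents are searchable. This is a sketch: the YQL below is illustrative, and it assumes the port-forward to 8080 from the previous step is still active.

```shell
# Verify the feed with a query through the port-forwarded dataplane (8080).
# The YQL is illustrative; adjust it to your schema.
if vespa query 'select * from music where true' 2>/dev/null; then
  query_state=ok
else
  query_state=failed   # e.g. the port-forward to 8080 is not active
fi
echo "query: $query_state"
```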