# Upgrade Vespa on Kubernetes

Vespa on Kubernetes supports zero-downtime rolling upgrades. An upgrade involves upgrading the `vespa-operator` via the Helm chart and the ConfigServer and Application (Container and Content) Pods through the `VespaSet` resource.

We do not support version drift between the `vespa-operator` and the `VespaSet`. Accordingly, plan upgrades so that all components are updated together. To ensure availability, perform the steps in the order shown in this guide.

## Update the CRD

Some upgrades may introduce changes to the `VespaSet` CRD. These changes should be applied to the cluster before performing the upgrade. As a rule of thumb, we recommend applying the CRD before every upgrade.

Helm does not manage the lifecycle of a CRD after it is installed (see [the official documentation](https://helm.sh/docs/chart_best_practices/custom_resource_definitions/)). As a result, CRD updates must be handled manually. With the official Helm chart for Vespa on Kubernetes, this is done by extracting the CRD from the OCI package and applying it directly using `kubectl`.

```
$ helm show crds $HELM_CHART_REF --version $VESPA_VERSION > vespaset-crd.yaml
$ kubectl apply -f vespaset-crd.yaml
```
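Before applying, it can help to preview what will change and then confirm that the API server has accepted the updated CRD. The sketch below wraps those steps in a hypothetical helper, `update_crd`; it is an illustration, not part of the Vespa tooling. The CRD name passed in the usage line, `vespasets.k8s.ai.vespa`, is an assumption inferred from the API group and finalizer that appear elsewhere on this page; verify it with `kubectl get crds`.

```shell
# Hypothetical helper: preview, apply, and verify a CRD update.
# Usage: update_crd <crd-file> <crd-name>
update_crd() {
  local crd_file="$1" crd_name="$2"

  # Show the difference between the live CRD and the file.
  # kubectl diff exits 1 when differences exist; only >1 is an error.
  kubectl diff -f "$crd_file"
  local rc=$?
  [ "$rc" -le 1 ] || return "$rc"

  kubectl apply -f "$crd_file" || return 1

  # Block until the API server reports the CRD as established.
  kubectl wait --for=condition=established "crd/$crd_name" --timeout=60s
}
```

For example: `update_crd vespaset-crd.yaml vespasets.k8s.ai.vespa`.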

## Upgrade the Vespa Operator

The operator can be upgraded through Helm by running `helm upgrade` with the new `VESPA_VERSION`. Replace `$NAMESPACE` with the namespace where Vespa is installed. Refer to [Factory](https://factory.vespa.ai/) for the latest `VESPA_VERSION`. Note that upgrading the operator does not affect the ConfigServer and Application Pods; they are upgraded in a subsequent step.

```
$ helm upgrade vespa-operator vespa/vespa-operator \
  --version $VESPA_VERSION \
  --namespace $NAMESPACE \
  --reuse-values
```

Wait for the operator to finish rolling out before proceeding.

```
$ kubectl rollout status deployment/vespa-operator -n $NAMESPACE
```
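Once the rollout has finished, one way to confirm that the operator is running the expected image is to read it back from the Deployment. This is a minimal sketch: `operator_image` is a hypothetical helper name, and it assumes the Deployment is called `vespa-operator`, as in the rollout command above.

```shell
# Hypothetical helper: print the image(s) used by the operator Deployment.
# Usage: operator_image <namespace>
operator_image() {
  local namespace="$1"
  kubectl get deployment vespa-operator -n "$namespace" \
    -o jsonpath='{.spec.template.spec.containers[*].image}'
}
```

Running `operator_image $NAMESPACE` should print an image reference whose tag corresponds to the version passed to `helm upgrade`.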

## Upgrade the VespaSet

To upgrade the ConfigServer and Application Pods, update the `spec.version` field in the `VespaSet` resource. Before proceeding, ensure that the target images, `VESPA_OPERATOR_IMAGE:VESPA_VERSION` and `VESPA_IMAGE:VESPA_VERSION`, are available and accessible from the Kubernetes Nodes. For example:

```
$ cat > vespaset.yaml <<EOF
apiVersion: k8s.ai.vespa/v1
kind: VespaSet
metadata:
  name: vespaset-sample
  namespace: ${NAMESPACE}
spec:
  version: 8.566.7 # Specify the version to upgrade to.
  configServer:
    image: "${VESPA_OPERATOR_IMAGE}"
    storageClass: "gp3"
    generateRbac: false

  application:
    image: "${VESPA_IMAGE}"
    storageClass: "gp3"

  ingress:
    endpointType: "NONE"
EOF

$ kubectl apply -f vespaset.yaml
```
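If only the version changes, an alternative to re-applying the full manifest is a JSON merge patch on `spec.version`. The sketch below is an assumption about how this could be done with standard `kubectl`, not a documented Vespa workflow; `set_vespaset_version` is a hypothetical helper name, and the resource name `vespaset` is taken from the `kubectl describe vespaset` command used on this page.

```shell
# Hypothetical helper: set spec.version on an existing VespaSet.
# Usage: set_vespaset_version <name> <namespace> <version>
set_vespaset_version() {
  local name="$1" namespace="$2" version="$3"
  # A merge patch changes only the listed field and leaves the rest of spec as-is.
  kubectl patch vespaset "$name" -n "$namespace" --type merge \
    -p "{\"spec\":{\"version\":\"$version\"}}"
}
```

For example: `set_vespaset_version vespaset-sample $NAMESPACE 8.566.7`. If the manifest is kept in version control, update `vespaset.yaml` as well so that a later `kubectl apply` does not revert the version.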

The ConfigServer Pods will detect the change to the `VespaSet` resource and orchestrate the upgrade of themselves and the Application Pods.

## Upgrade Sequence

The upgrade always proceeds in two phases: ConfigServer Pods are upgraded first, followed by Application Pods. This ordering is required because the Config Servers must be running the new version before they can safely orchestrate Application Pods onto it.

Additionally, the base template used to create a Vespa Pod, whether a ConfigServer or an Application Pod, may change between versions. Upgrading the ConfigServer Pods first ensures that Pods are created from the latest template rather than a stale one, which avoids needlessly recreating them with the latest template after the upgrade.

During the upgrade, Pods are upgraded sequentially, one at a time. For each Pod, the operator:

1. Drains the Pod of traffic and flushes any in-memory state to disk.
2. Deletes the Pod and recreates it with the new image.
3. Waits for the Pod to become healthy and report its `Converged Version` as a status on the `VespaSet`.
4. Proceeds to the next Pod.

The cluster remains operational throughout this procedure. The remaining ConfigServer Pods continue serving configuration to Application Pods while each node is upgraded in turn, and the Dataplane layer will continue to serve traffic as normal. For Content Pods, the operator waits for data redistribution to complete before moving to the next Pod, ensuring no data loss during the rollout.

To ensure zero downtime for any applications, ingress should be properly configured so that traffic is correctly load balanced across the Dataplane layer, allowing requests to be seamlessly routed away from Pods undergoing upgrades. Refer to the [Ingress](../configuration/ingress.html) page for more details.

## Monitoring the Upgrade

Throughout the upgrade, each Pod's status is reflected in the `VespaSet` status. A Pod that is actively being upgraded reports its phase as `UPGRADING`. A Pod that has successfully completed the upgrade reports its `Converged Version` as the new version.

In the example below, the Config Server Pods have all converged to `8.577`, while the Application Pod `default-100` is currently upgrading and has not yet converged from `8.576`.

```
$ kubectl describe vespaset vespaset-sample -n $NAMESPACE
Name: vespaset-sample
Namespace: $NAMESPACE
Labels: <none>
Annotations: <none>
API Version: k8s.ai.vespa/v1
Kind: VespaSet
Metadata:
  Creation Timestamp: 2026-01-29T21:32:27Z
  Finalizers:
    vespasets.k8s.ai.vespa/finalizer
  Generation: 1
  Resource Version: 121822902
  UID: a70f56e9-6625-4011-acd7-9f7cad29dbc2
Spec:
  Application:
    Image: $VESPA_IMAGE
    Storage Class: gp3
  Config Server:
    Generate Rbac: false
    Image: $VESPA_IMAGE
    Storage Class: gp3
  Ingress:
    Endpoint Type: LOAD_BALANCER
  Version: 8.577
Status:
  Bootstrap Status:
    Pods:
      cfg-1:
        Last Updated: 2026-01-29T21:38:45Z
        Message: Pod is running
        Phase: RUNNING
        Converged Version: 8.577
      cfg-2:
        Last Updated: 2026-01-29T21:38:09Z
        Message: Pod is running
        Phase: RUNNING
        Converged Version: 8.577
      cfg-3:
        Last Updated: 2026-01-29T21:36:32Z
        Message: Pod is running
        Phase: RUNNING
        Converged Version: 8.577
      default-100:
        Last Updated: 2026-01-29T21:38:45Z
        Message: Pod is upgrading
        Phase: UPGRADING
        Converged Version: 8.576
      default-101:
        Last Updated: 2026-01-29T21:38:09Z
        Message: Pod is running
        Phase: RUNNING
        Converged Version: 8.576
      documentation-102:
        Last Updated: 2026-01-29T21:36:32Z
        Message: Pod is running
        Phase: RUNNING
        Converged Version: 8.576
      documentation-103:
        Last Updated: 2026-01-29T21:36:32Z
        Message: Pod is running
        Phase: RUNNING
        Converged Version: 8.576
      cluster-controller-104:
        Last Updated: 2026-01-29T21:36:32Z
        Message: Pod is running
        Phase: RUNNING
        Converged Version: 8.576
      cluster-controller-105:
        Last Updated: 2026-01-29T21:36:32Z
        Message: Pod is running
        Phase: RUNNING
        Converged Version: 8.576
      cluster-controller-106:
        Last Updated: 2026-01-29T21:36:32Z
        Message: Pod is running
        Phase: RUNNING
        Converged Version: 8.576
  Last Transition Time: 2026-01-29T21:33:55Z
  Message: All configservers running
  Phase: RUNNING
Events: <none>
```

The upgrade is complete when every Pod's `Converged Version` matches the new version and all phases report `RUNNING`.
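Scanning the full `describe` output by eye becomes tedious with many Pods. The sketch below filters it down to the Pods that have not yet converged. It is a hypothetical helper (`pending_pods`) that assumes the text layout shown in the example above, with Pods listed under a `Pods:` key and each carrying `Phase` and `Converged Version` fields; if the layout differs in your version, the parsing needs to be adjusted.

```shell
# Hypothetical helper: read `kubectl describe vespaset` output on stdin and
# print "<pod> <phase> <converged version>" for every Pod that is not yet
# RUNNING on the target version.
# Usage: kubectl describe vespaset <name> -n <namespace> | pending_pods <version>
pending_pods() {
  local target="$1"
  awk -v target="$target" '
    # Print the current Pod if it has not converged, then reset the state.
    function report() {
      if (pod != "" && (version != target || phase != "RUNNING"))
        print pod, phase, version
      pod = ""; phase = ""; version = ""
    }
    # Indentation tells us where the Pods section ends.
    { match($0, /^ */); indent = RLENGTH }
    NF == 1 && $1 == "Pods:" { in_pods = 1; pods_indent = indent; next }
    in_pods && indent <= pods_indent { report(); in_pods = 0 }
    # A single token ending in ":" inside the section starts a new Pod entry.
    in_pods && NF == 1 && $1 ~ /:$/ {
      report(); pod = substr($1, 1, length($1) - 1); next
    }
    in_pods && $1 == "Phase:" { phase = $2 }
    # Append "" to force a string comparison of version numbers.
    in_pods && $1 == "Converged" && $2 == "Version:" { version = $3 "" }
    END { report() }
  '
}
```

For example, `kubectl describe vespaset vespaset-sample -n $NAMESPACE | pending_pods 8.577` prints one line per Pod that still needs to converge, and prints nothing once the upgrade is complete.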

## Debugging Upgrade Failures

If a Pod fails to converge to the target version (for example, due to an image pull failure, a crash loop, or a failed health check), the ConfigServer will continuously retry the upgrade for that Pod until it either succeeds or an administrator intervenes.

In this scenario, the administrator can diagnose the issue by inspecting the ConfigServer logs or the events of the failing Pod in the current upgrade phase. Once the issue is resolved, the ConfigServer will automatically retry the upgrade for that Pod and proceed with the remaining nodes.

For example, suppose the Pod `search-106` is failing to upgrade.

```
$ kubectl logs cfg-1 -n $NAMESPACE
$ kubectl logs cfg-2 -n $NAMESPACE
$ kubectl logs cfg-3 -n $NAMESPACE
$ kubectl describe pod search-106 -n $NAMESPACE
```

This design prevents a bad upgrade from cascading to the rest of the Pods. Since the ConfigServer refuses to advance past a Pod that has not converged, the remaining Pods stay on the previous known-good version while the administrator investigates.
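To narrow down the causes mentioned earlier (image pull failures, crash loops, failed health checks), the Pod's events and the logs of its previous container instance are often the quickest place to look. The following is a minimal sketch using standard `kubectl` commands; `pod_diagnostics` is a hypothetical helper name, and the `--tail` value is an arbitrary choice.

```shell
# Hypothetical helper: collect basic diagnostics for a Pod that fails to upgrade.
# Usage: pod_diagnostics <pod> <namespace>
pod_diagnostics() {
  local pod="$1" namespace="$2"
  # Events recorded against the Pod, oldest first
  # (image pull errors, failed probes, scheduling problems).
  kubectl get events -n "$namespace" \
    --field-selector "involvedObject.name=$pod" --sort-by=.lastTimestamp
  # Logs from the previous container instance, useful during a crash loop.
  # This fails harmlessly if the container has not restarted.
  kubectl logs "$pod" -n "$namespace" --previous --tail=50
}
```

For example: `pod_diagnostics search-106 $NAMESPACE`.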
