Enterprise (not open source): This functionality is only available commercially.

Lifecycle Operations for Vespa on Kubernetes

The ConfigServer and Vespa Application Pods have built-in resilience and recovery capabilities; they are automatically recovered during failures and gracefully shut down during maintenance or scaling operations to preserve data integrity.

Automatic Recovery

Vespa relies on standard Kubernetes controllers to detect and restart crashed Pods. If a container exits unexpectedly (e.g., OOMKilled or application crash), the kubelet will automatically restart it.

However, the ConfigServers track the health history of every Pod. To prevent a crash loop from causing cascading failures or constantly churning resources, the system implements a strict throttling mechanism: the ConfigServers allow a maximum of 2 involuntary Pod disruptions per 24-hour period for a given Vespa Application. If this limit is exceeded, the ConfigServer stops automatically recovering these Pods, and human intervention is required to investigate the root cause.

Graceful Shutdown

To prevent query failures or data loss during termination, a preStop hook is placed on every ConfigServer and Vespa Application Pod. During a voluntary disruption, this hook ensures that existing traffic is drained and that data is flushed before the Pod is terminated.

Two types of disruptions exist in Kubernetes:

| Type | Scenario | Behavior |
| --- | --- | --- |
| Voluntary disruption | Scaling down, rolling upgrades, or node maintenance. | The preStop hook detects the voluntary disruption, stops the Vespa Container cluster from accepting new traffic, flushes in-memory data to disk for Content clusters, and ensures a clean exit before the Pod is deleted. |
| Involuntary disruption | Node hardware failure, kernel panic, or eviction. | Kubernetes initiates the termination, and the preStop hook attempts to run to flush data and close connections. However, if the Pod is lost abruptly, the hook cannot run, and recovery relies on Vespa's data replication. |

Pod Disruption Budget

Defining a PodDisruptionBudget (PDB) is not supported for Vespa on Kubernetes. The ConfigServers will override any PDB with their own orchestration policy.

Application Pod Resources

For Vespa Application Pods, the resources per Pod, the number of Pods in a Vespa cluster, and the group configuration can be updated through the <services> element in the application package. Refer to the specification for more details.
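For example, the node count and per-node resources for a content cluster can be declared in services.xml roughly as follows (a sketch; the cluster id and resource values are illustrative):

```xml
<content id="music" version="1.0">
    <redundancy>2</redundancy>
    <!-- document definitions omitted -->
    <nodes count="4">
        <!-- Resources allocated to each node in the cluster -->
        <resources vcpu="8" memory="32Gb" disk="300Gb"/>
    </nodes>
</content>
```

Changing count or the resources attributes and redeploying the application package updates the cluster accordingly.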

ConfigServer Pod Resources

ConfigServer Pod resources can be configured by overriding the vespa container's resource specification via the PodTemplate in the VespaSet. The ConfigServer deduces its heap size from the Pod cgroup limits, which are derived from the requests and limits set on the Pod. Setting requests and limits to the same value is recommended to ensure the heap size is deduced correctly.

Horizontally scaling the replica count for ConfigServer Pods is not supported.

apiVersion: k8s.ai.vespa/v1
kind: VespaSet
metadata:
  name: sample-vespaset
spec:
  configServer:
    image: "$VESPA_IMAGE"
    storageClass: "gp3"
    podTemplate:
      spec:
        containers:
          - name: vespa
            resources:
              requests:
                cpu: "4"
                memory: "8Gi"
              limits:
                cpu: "4"
                memory: "8Gi"

Autoscaling

Vespa on Kubernetes provides autoscaling through ranges specified on the nodes and resources elements in the application package. Refer to the Autoscaling guide for more details.
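As a sketch (the values are illustrative), a range is written as [min, max] on the attributes to be autoscaled, letting the node count and per-node vcpu vary within the given bounds:

```xml
<nodes count="[4, 8]">
    <!-- count and vcpu may scale within their ranges; memory and disk are fixed -->
    <resources vcpu="[2, 4]" memory="16Gb" disk="100Gb"/>
</nodes>
```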