# Using Kubernetes with Vespa

 

**Note:** This article is a recipe for starting self-managed Vespa in a Kubernetes cluster. For production serving, [Vespa on Kubernetes](../kubernetes/vespa-on-kubernetes.html) is a good read; the Vespa Operator provides a more Kubernetes-native integration with a high degree of automation and value-adds.

This article outlines how to run Vespa using Kubernetes. For a quickstart running Vespa in a single pod, see [singlenode quickstart with minikube](#singlenode-quickstart-with-minikube).

Setting up a multi-pod Vespa cluster is more involved and requires knowledge of how Vespa configures its services. Use the [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA/gke) sample application as a basis for the configuration.

![Vespa overview illustration](/assets/img/vespa-overview.svg)
- A Vespa cluster is made of one or more config servers in a config server cluster. This cluster keeps configuration for the services running in the service pods. The config server cluster pods should hence be started first. 
- Config servers use Apache ZooKeeper for shared state. The config servers will not set their _/state/v1/health_ to UP before ZooKeeper quorum is reached. This means all config server pods must be running before quorum can form, so a _readinessProbe_ cannot be used for a staggered config server start.
- See a practical example at [config server cluster startup](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA/gke#config-server-cluster-startup) - once completed it should look like:
```
$ kubectl get pods
NAME                   READY   STATUS    RESTARTS   AGE
vespa-configserver-0   1/1     Running   0          2m45s
vespa-configserver-1   1/1     Running   0          107s
vespa-configserver-2   1/1     Running   0          62s
```
- Once the config server cluster is started successfully, the [application package](../../basics/applications.html) can be deployed, and the pods for the service nodes started. The application package maps services to pods (nodes), so it must be deployed successfully before the services in the pods can start. It does not matter whether the application package is deployed before or after starting the service pods; the pods will idle, waiting for configuration.
- [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA/gke) starts the pods first, see [Vespa startup](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA/gke#vespa-startup). As the application package is not yet deployed, the services inside the pods are not started (they are not yet configured). The Vespa infrastructure is started, however - see [config sentinel](config-sentinel.html) - so at this point the pods run with the config-proxy waiting for services config.
- The [cluster startup](config-sentinel.html#cluster-startup) feature is good to know: it holds back starting a service until enough services can run - see the _Connectivity check_ log messages.
- Deploy the application package. At this point, the pods will know which service to run, and start a container or content node service. Shortly after, the _/state/v1/health_ endpoint is enabled on the pods. 
- Note that ports are allocated dynamically, but the defaults will get you started - see the illustration with [services and ports](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA#get-started) for _/state/v1/health_: 
  - Config server: 19071
  - Container node: 8080
  - Content node: 19107
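
With the default ports above, readiness can be checked with curl once the endpoints are reachable (e.g. via port-forwarding) - illustrative commands:

```
$ curl -s --head http://localhost:19071/state/v1/health   # config server
$ curl -s --head http://localhost:8080/state/v1/health    # container node
$ curl -s --head http://localhost:19107/state/v1/health   # content node
```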

The list above is an overview of the config server -> application package -> service _/state/v1/health_ dependency chain. This sequence of steps must be considered when building the Kubernetes cluster configuration.
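
Because the config servers cannot use a _readinessProbe_ for a staggered start, a common pattern is to front them with a headless Service that sets `publishNotReadyAddresses: true`, so the pods can resolve each other and form the ZooKeeper quorum before any of them report ready. A sketch, assuming a config server StatefulSet labeled `app: vespa-configserver`:

```
apiVersion: v1
kind: Service
metadata:
  name: vespa-configserver
spec:
  clusterIP: None                  # headless - DNS resolves directly to pod IPs
  publishNotReadyAddresses: true   # publish addresses before readiness, so quorum can form
  selector:
    app: vespa-configserver
  ports:
  - name: config
    port: 19071
```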

A good next step is running the [multinode-HA](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA/gke) for Kubernetes - there you will also find useful [troubleshooting](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/multinode-HA/gke#misc--troubleshooting) tools.

## Singlenode quickstart with minikube

This section describes how to install and run Vespa on a single machine using Kubernetes (K8s). Also see [Vespa example on GKE](https://github.com/vespa-engine/sample-apps/tree/master/examples/operations/basic-search-on-gke).

**Prerequisites:**

- Linux, macOS or Windows 10 Pro on x86\_64 or arm64, with [Podman Desktop](https://podman.io/) or [Docker Desktop](https://www.docker.com/products/docker-desktop/) installed and a container engine running. 
  - Alternatively, start the Podman daemon:
```
$ podman machine init --memory 6000
$ podman machine start
```
  - See [Docker Containers](/en/operations/self-managed/docker-containers.html) for system limits and other settings.

- For CPUs older than Haswell (2013), see [CPU Support](/en/cpu-support.html).
- Memory: Minimum 5 GB RAM dedicated to Docker/Podman. [Memory recommendations](/en/operations/self-managed/node-setup.html#memory-settings). 
- Disk: Make sure there is enough disk space for the vespaengine/vespa container image plus headroom for data, to avoid feed blocking (`NO_SPACE`). [Read more](/en/writing/feed-block.html). 
- [Homebrew](https://brew.sh/) to install the [Vespa CLI](/en/clients/vespa-cli.html), or download the Vespa CLI from [Github releases](https://github.com/vespa-engine/vespa/releases). 
- [Git](https://git-scm.com/downloads).
- [Minikube](https://kubernetes.io/docs/tasks/tools/).

1. **Validate environment:**

Refer to [Docker memory](docker-containers.html#memory) for details and troubleshooting:

```
$ docker info | grep "Total Memory"
# or:
$ podman info | grep "memTotal"
```

2. **Start Kubernetes cluster with minikube:**

```
$ minikube start --driver docker --memory 4096
```

3. **Clone the [Vespa sample apps](https://github.com/vespa-engine/sample-apps):**

```
$ git clone --depth 1 https://github.com/vespa-engine/sample-apps.git
$ export VESPA_SAMPLE_APPS=$(pwd)/sample-apps
```

4. **Create Kubernetes configuration files:**

```
$ cat << EOF > service.yml
apiVersion: v1
kind: Service
metadata:
  name: vespa
  labels:
    app: vespa
spec:
  selector:
    app: vespa
  type: NodePort
  ports:
  - name: container
    port: 8080
    targetPort: 8080
    protocol: TCP
  - name: config
    port: 19071
    targetPort: 19071
    protocol: TCP
EOF
```

```
$ cat << EOF > statefulset.yml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: vespa
  labels:
    app: vespa
spec:
  replicas: 1
  serviceName: vespa
  selector:
    matchLabels:
      app: vespa
  template:
    metadata:
      labels:
        app: vespa
    spec:
      containers:
      - name: vespa
        image: vespaengine/vespa
        imagePullPolicy: Always
        env:
        - name: VESPA_CONFIGSERVERS
          value: vespa-0.vespa.default.svc.cluster.local
        securityContext:
          runAsUser: 1000
        ports:
        - containerPort: 8080
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /state/v1/health
            port: 19071
            scheme: HTTP
EOF
```

5. **Start the service:**

```
$ kubectl apply -f service.yml -f statefulset.yml
```

6. **Wait for the service to enter a running state:**

```
$ kubectl get pods --watch
```

Wait for STATUS Running:

```
NAME      READY   STATUS              RESTARTS   AGE
vespa-0   0/1     ContainerCreating   0          8s
vespa-0   0/1     Running             0          2m4s
```
7. **Start port forwarding to pods:**

```
$ kubectl port-forward vespa-0 19071 8080 &
```

8. **Wait for the config server to start - wait for 200 OK:**

```
$ curl -s --head http://localhost:19071/state/v1/health
```
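
Instead of re-running curl by hand, the wait can be scripted. A minimal sketch - the `wait_until` helper name and retry count are illustrative, not part of the Vespa or kubectl tooling:

```shell
# Retry a command until it succeeds or the attempts run out.
# Hypothetical helper - adjust retries and sleep interval to your setup.
wait_until() {
  local tries=$1; shift
  local i
  for i in $(seq 1 "$tries"); do
    "$@" && return 0
    sleep 2
  done
  return 1
}

# Example: block until the config server health endpoint returns 200 OK:
# wait_until 60 curl -sf -o /dev/null http://localhost:19071/state/v1/health
```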

9. **Deploy and activate the application package:**

```
$ vespa deploy ${VESPA_SAMPLE_APPS}/album-recommendation
```

10. **Ensure the application is active - wait for 200 OK:**

This normally takes a minute or so:

```
$ curl -s --head http://localhost:8080/state/v1/health
```

11. **Feed documents:**

```
$ vespa feed ${VESPA_SAMPLE_APPS}/album-recommendation/ext/documents.jsonl
```

12. **Make a query:**
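
An illustrative query using the Vespa CLI - the YQL assumes the `music` document type from the album-recommendation application:

```
$ vespa query 'select * from music where album contains "head"'
```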

13. **Run a document get request:**

```
$ vespa document get id:mynamespace:music::love-is-here-to-stay
```

14. **Clean up:**

At any point during the procedure, dump logs for troubleshooting:

```
$ kubectl logs vespa-0
```
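
When done, remove the resources - these commands assume the object names from the manifests above:

```
$ kubectl delete service,statefulset vespa
$ minikube stop
$ minikube delete
```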


