The Vespa Operator automatically provisions Kubernetes Service resources to enable external access for feeding and querying data.
This behavior is controlled by the VespaSet Custom Resource. See the VespaSet Reference for the
available configuration fields.
Load balancers are provisioned exclusively for Container clusters. Content clusters communicate internally
and do not require external load balancing services. The type of service provisioned is determined by the
spec.ingress.endpointType field in the VespaSet.
The operator supports four endpoint types to cover different infrastructure requirements.
| Endpoint Type | Kubernetes Service Type | Use Case |
|---|---|---|
| LOAD_BALANCER | LoadBalancer | Provision a cloud-native load balancer (AWS, GCP, Azure). |
| NODE_PORT | NodePort | Expose a static port on every worker node, allowing external traffic to reach the cluster through any node's IP. |
| CLUSTER_IP | ClusterIP | Expose the Container cluster on an internal IP address only. Should not be used for production use cases. |
| NONE | N/A | No external access layer is provisioned. Intended for custom networking setups (Istio, Ingress Controllers) where no automatic Service is desired. |
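The mapping in the table can be checked against a running cluster. The sketch below is illustrative and not part of the operator: `expected_service_type` is a hypothetical helper that simply encodes the table, and the commented `kubectl` call assumes the Service is named `lb-default`, as in the NODE_PORT example on this page.

```shell
# Hypothetical helper: map a VespaSet endpointType to the Kubernetes
# Service type listed in the table. Prints an empty line for NONE
# (no Service is provisioned) and returns non-zero for an unknown value.
expected_service_type() {
  case "$1" in
    LOAD_BALANCER) echo "LoadBalancer" ;;
    NODE_PORT)     echo "NodePort" ;;
    CLUSTER_IP)    echo "ClusterIP" ;;
    NONE)          echo "" ;;
    *)             return 1 ;;
  esac
}

# Example comparison against the live Service (assumes the name lb-default):
#   actual=$(kubectl get service lb-default -n "$NAMESPACE" -o jsonpath='{.spec.type}')
#   [ "$actual" = "$(expected_service_type NODE_PORT)" ] && echo "match"
```

A mismatch would suggest the VespaSet change has not been reconciled yet, or that the Service being inspected is not the one the operator created.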
LOAD_BALANCER is the recommended configuration for production deployments on cloud providers (EKS, GKE, AKS).
The operator creates a standard Kubernetes LoadBalancer Service, which triggers the cloud provider to provision
a managed load balancer (e.g., AWS NLB).
Configuration:

    ingress:
      endpointType: LOAD_BALANCER

On AWS, the ConfigServer automatically applies the annotation service.beta.kubernetes.io/aws-load-balancer-internal: "true" to the Services created for Container clusters.
This provisions an internal Network Load Balancer (NLB) that is accessible only within the VPC where the EKS cluster nodes reside.
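Once the load balancer has been provisioned, its address appears in the Service status. The following is a minimal sketch of how it might be looked up; `lb_address` is a hypothetical helper, and the Service name `lb-default` is assumed from the NODE_PORT example on this page.

```shell
# Print the external address of a LoadBalancer Service.
# AWS load balancers report a hostname; some other providers report an IP,
# so fall back to the ip field when the hostname is empty.
# Usage: lb_address <service> <namespace>
lb_address() {
  addr=$(kubectl get service "$1" -n "$2" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
  if [ -z "$addr" ]; then
    addr=$(kubectl get service "$1" -n "$2" \
      -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
  fi
  echo "$addr"
}

# Example (assumes the Service is named lb-default):
#   curl "http://$(lb_address lb-default "$NAMESPACE")/state/v1/health"
```

Because the AWS load balancer is internal, the `curl` call would have to run from a host inside the same VPC. While provisioning is still in progress, the status may not contain an address yet.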
The NODE_PORT type exposes the Vespa container cluster on a specific port (range 30000-32767) across all Kubernetes worker nodes.
Configuration:

    ingress:
      endpointType: NODE_PORT

When this option is set, Kubernetes opens a static port on every worker node. External traffic can reach the application via <NodeIP>:<NodePort>.
Unlike LOAD_BALANCER, NODE_PORT provides no health checks at the entry point: if the worker node a client is connected to crashes, the connection
times out or fails. NODE_PORT also requires every worker node to expose an External IP.
To use the NODE_PORT service, find the assigned port.

    $ kubectl get service lb-default -n $NAMESPACE
    NAME         TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
    lb-default   NodePort   10.100.150.25   <none>        80:31942/TCP   5m

Get the list of nodes and look for their External IP addresses.

    $ kubectl get nodes -o wide
    NAME                                         STATUS   ROLES    AGE   VERSION               INTERNAL-IP    EXTERNAL-IP     OS-IMAGE         KERNEL-VERSION
    ip-192-168-3-50.us-east-2.compute.internal   Ready    <none>   10d   v1.27.3-eks-a5565ad   192.168.3.50   18.221.100.45   Amazon Linux 2   5.10.184-175.731.amzn2.x86_64
    ip-192-168-3-51.us-east-2.compute.internal   Ready    <none>   10d   v1.27.3-eks-a5565ad   192.168.3.51   3.142.200.10    Amazon Linux 2   5.10.184-175.731.amzn2.x86_64

Choose any External IP and combine the IP and port to access the service.

    $ curl http://18.221.100.45:31942/state/v1/health
    {
      "time" : 1769981985754,
      "status" : {
        "code" : "up"
      },
      "metrics" : {
        "snapshot" : {
          "from" : 1.769981924895E9,
          "to" : 1.769981984895E9
        },
        "values" : [ {
          "name" : "requestsPerSecond",
          "values" : {
            "count" : 19,
            "rate" : 0.31666666666666665
          }
        }, {
          "name" : "latencySeconds",
          "values" : {
            "average" : 0.009578947368421053,
            "sum" : 0.182,
            "count" : 19,
            "last" : 0.003,
            "max" : 0.057,
            "min" : 0.0,
            "rate" : 0.31666666666666665
          }
        } ]
      }
    }

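The manual lookup above can also be scripted. The sketch below uses `kubectl` JSONPath output instead of reading the tables by eye; the function names are hypothetical, and it assumes the Service is named `lb-default` and that the first Service port is the one of interest.

```shell
# Print the node port assigned to the first port of a NodePort Service.
# Usage: node_port <service> <namespace>
node_port() {
  kubectl get service "$1" -n "$2" -o jsonpath='{.spec.ports[0].nodePort}'
}

# Print the External IP of every node, one line per node.
# Nodes without an External IP produce an empty line.
node_external_ips() {
  kubectl get nodes \
    -o jsonpath='{range .items[*]}{.status.addresses[?(@.type=="ExternalIP")].address}{"\n"}{end}'
}

# Build the health-check URL from a node IP and a node port.
# Usage: health_url <node-ip> <node-port>
health_url() {
  echo "http://$1:$2/state/v1/health"
}

# Example: probe the first node that reports an External IP.
#   port=$(node_port lb-default "$NAMESPACE")
#   ip=$(node_external_ips | sed '/^$/d' | head -n 1)
#   curl "$(health_url "$ip" "$port")"
```

Since NODE_PORT offers no health checking at the entry point, a client script along these lines would probably want to try the next IP in the list when a request times out.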
The CLUSTER_IP type restricts access to within the Kubernetes cluster. It provides a stable internal IP and DNS name
(e.g., lb-default.vespa.svc.cluster.local) but assigns no external IP.
Configuration:

    ingress:
      endpointType: CLUSTER_IP

The CLUSTER_IP service is ideal for architectures where the clients (e.g., front-end applications or ingestion services) run inside the same Kubernetes cluster as Vespa.
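In-cluster clients would normally address the Service by its DNS name rather than by IP. As a small sketch, the hypothetical helper `service_dns` below assembles that name from the Service and namespace; `cluster.local` is the Kubernetes default cluster domain and may differ in a given cluster.

```shell
# Build the in-cluster DNS name of a Service:
#   <service>.<namespace>.svc.<cluster-domain>
# The cluster domain defaults to cluster.local; pass a third argument
# if the cluster is configured with a different one.
# Usage: service_dns <service> <namespace> [cluster-domain]
service_dns() {
  echo "$1.$2.svc.${3:-cluster.local}"
}

# Example, run from a Pod inside the cluster (assumes Service lb-default
# in namespace vespa):
#   curl "http://$(service_dns lb-default vespa)/state/v1/health"
```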
The NONE option disables automatic Service provisioning. Use it if you intend to define Ingress resources manually,
use a service mesh (such as Istio or Linkerd), or have complex networking requirements not covered by the standard types.
Configuration:

    ingress:
      endpointType: NONE

To ensure zero-downtime deployments, the ConfigServer manages traffic routing dynamically using Kubernetes labels.
The created Services use the selector vespa.ai/tenant-lb: backend. This label is attached automatically
when a Pod is provisioned.
During a rolling upgrade, the label is removed from terminating Pods before they are shut down. This gives in-flight traffic a window to drain before each Pod is upgraded.
Note: The Service exposes ports 80 (plaintext) and 443 (TLS) externally and maps both to the container's port 4443.
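The label-based routing and the port mapping can both be inspected with `kubectl`. The sketch below is one possible way to do so; the function names are hypothetical, and the Service name `lb-default` is assumed from the NODE_PORT example on this page.

```shell
# List the Pods that currently carry the routing label and therefore
# match the Service selector.
# Usage: routed_pods <namespace>
routed_pods() {
  kubectl get pods -n "$1" -l vespa.ai/tenant-lb=backend -o name
}

# Print each Service port and the container port it maps to,
# one "port -> targetPort" pair per line.
# Usage: service_port_map <service> <namespace>
service_port_map() {
  kubectl get service "$1" -n "$2" \
    -o jsonpath='{range .spec.ports[*]}{.port}{" -> "}{.targetPort}{"\n"}{end}'
}

# Example:
#   routed_pods "$NAMESPACE"
#   service_port_map lb-default "$NAMESPACE"
```

If routing behaves as described, running `routed_pods` repeatedly during a rolling upgrade should show each terminating Pod dropping out of the list shortly before it shuts down.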