Node and Network setup - Concepts and Requirements

Vespa is composed of services that communicate and interact with each other. These services can be spread across any number of machines for scaling, or they can all coexist in a single environment for development. To achieve this flexibility, the environment where the services run must meet some requirements.

The node concept

A node in this context is the environment where some Vespa services are running. This can be an actual machine, like a server in a datacenter or a laptop used for development and testing of Vespa configuration. It can also be a Virtual Machine or a Docker container, so you can run multiple nodes on a single piece of hardware.

The different Vespa services that run on nodes will mostly communicate with each other via the network. This means that all nodes must have an IP address and have network connectivity to all other nodes. Both IPv4 and IPv6 protocols are supported. Note that the same framework is used even when running the entire Vespa stack on a single node.

The host name

In order to find the IP address of a node and connect to it, the node must have a host name that identifies it and which maps to its IP address. Actual machines on a network will usually have a Fully Qualified Domain Name (FQDN) in DNS which should be used as the host name for this purpose.

Note that it is a requirement that the host name (configured in hosts.xml) can be used to look up the IP address of the node. The configuration server uses this host name to construct URLs that will be used to open network connections to Vespa services running on that node. If your nodes use IP addresses which don't have DNS names, you must list all those IP addresses with their corresponding host names in /etc/hosts on all nodes in your Vespa installation. We recommend that you use names that can serve as FQDNs in this case too, in case you later want to move to a DNS server instead of distributing the /etc/hosts file.
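
The /etc/hosts requirement above can be checked mechanically. The following is a minimal sketch, assuming a POSIX shell with awk available; the function name missing_hosts and the example host names are hypothetical and not part of Vespa. It prints every given host name that has no entry in a hosts-format file.

```shell
# Sketch: report which host names are missing from a hosts-format file.
# Usage:   missing_hosts HOSTS_FILE NAME...
# Example: missing_hosts /etc/hosts node1.example.com node2.example.com
# Names are compared exactly (case-sensitive), which is stricter than DNS.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, then look for the name in any field after the IP address.
        if ! awk -v want="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
            END { exit found ? 0 : 1 }
        ' "$hosts_file"; then
            printf '%s\n' "$name"
        fi
    done
}
```

An empty result means every name is listed. On Linux systems, getent hosts NAME performs the lookup through the system resolver instead, which covers DNS as well as /etc/hosts.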

Identifying the node for configuration

When Vespa services are started on a node, the node must identify itself to the configuration system to get the correct configuration (including which services to run). This requires a unique identifier for the node in the config server. Since it's already a requirement that the node has a host name that the config server must know, Vespa uses the same host name when a node identifies itself to get its configuration.

This means that the node must know its own host name (FQDN), and be in agreement with the config server about what exactly the host name is.

Finding your own host name

As discussed above, a Vespa node needs to know its own host name when it starts. Usually this is achieved by just running the hostname command. If you can arrange for hostname to return the FQDN of the node, then everything should Just Work (TM).
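
A quick way to see whether the hostname command already returns something that looks like an FQDN is sketched below. This is only a heuristic under stated assumptions: the function name is_fqdn_like is hypothetical, and the check only looks at the shape of the name (at least one dot, no empty leading label). It does not verify that the name resolves.

```shell
# Sketch: succeed if the argument looks like a fully qualified name.
# Example: is_fqdn_like "$(hostname)" || echo "hostname is not an FQDN"
is_fqdn_like() {
    case $1 in
        ""|.*|*..*) return 1 ;;   # empty, leading dot, or empty label
        *.*)        return 0 ;;   # at least one dot separating labels
        *)          return 1 ;;   # single label such as "node1"
    esac
}
```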

But sometimes this doesn't work properly, either because that name can't be resolved to an IP address that works for connecting to services running on the node, or because the name doesn't agree with what the config server thinks the node's host name is. In this case you can override the name by setting the VESPA_HOSTNAME environment variable; its value will then be used instead of the output of the hostname command.
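
The precedence described above can be sketched as a small shell function. This is an illustration only, not Vespa's actual startup code; the function name effective_hostname is hypothetical.

```shell
# Sketch: print the host name a node would use under the rule above.
# VESPA_HOSTNAME wins when it is set and non-empty; otherwise fall back
# to whatever the hostname command reports.
# Example: VESPA_HOSTNAME=node1.example.com effective_hostname
effective_hostname() {
    if [ -n "${VESPA_HOSTNAME:-}" ]; then
        printf '%s\n' "$VESPA_HOSTNAME"
    else
        hostname
    fi
}
```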

Note that VESPA_HOSTNAME will be used both when a node identifies itself to the config server and when a service on that node registers a network connection point that other services can connect to.

You should see an error message with "hostname detection failed" if VESPA_HOSTNAME isn't set and the name returned by hostname isn't usable. If you set VESPA_HOSTNAME to something that cannot work, you will get an error with "hostname validation failed" instead.

Simple single-node development environment

If you're just testing a Vespa configuration on a single-node setup, you can usually avoid some setup hassle by overriding the hostname with the value "localhost". Try this command for that purpose:
echo "override VESPA_HOSTNAME localhost" >> /opt/vespa/conf/vespa/default-env.txt
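
Because the command above appends to the file, running it more than once leaves duplicate lines. A minimal sketch of a re-runnable variant follows, assuming a POSIX shell with grep; the function name set_localhost_override is hypothetical, and the optional path argument exists only so the function can be tried against a scratch file.

```shell
# Sketch: append the override line only if the file does not already contain it.
# Usage: set_localhost_override [ENV_FILE]
# ENV_FILE defaults to the path used in the command above.
set_localhost_override() {
    env_file=${1:-/opt/vespa/conf/vespa/default-env.txt}
    line='override VESPA_HOSTNAME localhost'
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if ! grep -qxF "$line" "$env_file" 2>/dev/null; then
        printf '%s\n' "$line" >> "$env_file"
    fi
}
```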

Note that Java unit tests don't pick up settings in default-env.txt; they default to "localhost" if VESPA_HOSTNAME isn't set in the environment.