The Cloud Config System can be set up with one or more configuration servers (config servers). A config server uses ZooKeeper as a distributed data storage for the configuration system. In addition, each node runs a config proxy to cache configuration data - find an overview at services start.
Tools to access config:
# services.xml
<admin version="2.0">
    <configservers>
        <configserver hostalias="admin0" />
        <configserver hostalias="admin1" />
        <configserver hostalias="admin2" />
    </configservers>
</admin>

# hosts.xml
<hosts>
    <host name="myserver0.mydomain.com">
        <alias>admin0</alias>
    </host>
    <host name="myserver1.mydomain.com">
        <alias>admin1</alias>
    </host>
    <host name="myserver2.mydomain.com">
        <alias>admin2</alias>
    </host>
</hosts>

# VESPA_CONFIGSERVERS - must be set on all nodes in the application
VESPA_CONFIGSERVERS=myserver0.mydomain.com,myserver1.mydomain.com,myserver2.mydomain.com
Refer to the admin model reference. VESPA_CONFIGSERVERS must be set on all nodes. This is a comma or whitespace-separated list with the hostname of all configservers, like myhost1.mydomain.com,myhost2.mydomain.com,myhost3.mydomain.com.
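To illustrate the accepted format, here is a minimal sketch of splitting such a value on commas and/or whitespace. The function name `parse_configservers` is hypothetical and this is not how Vespa itself reads the variable; it only shows which inputs count as equivalent lists.

```python
import re
from typing import List


def parse_configservers(value: str) -> List[str]:
    """Split a VESPA_CONFIGSERVERS-style value on commas and/or whitespace."""
    # Hypothetical helper for illustration only, not part of Vespa.
    return [host for host in re.split(r"[,\s]+", value.strip()) if host]


if __name__ == "__main__":
    # Mixed separators yield the same list as a purely comma-separated value.
    print(parse_configservers(
        "myhost1.mydomain.com,myhost2.mydomain.com myhost3.mydomain.com"
    ))
```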
When there are multiple config servers, the config proxy will pick a config server based on a hash of their hostname and how many config servers there are (to achieve load balancing between config servers). The config proxy is fault-tolerant and will switch to another config server if it is unavailable or there is an error in the configuration it receives. With only one config server configured, it will continue using that in case of errors.
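The selection and failover behavior can be sketched roughly as follows. This is an illustration under stated assumptions, not the config proxy's actual implementation: it assumes the hash is taken over the proxy's own hostname, uses SHA-256 as a stand-in hash function, and models "unavailable or returning bad config" as a caller-supplied `is_healthy` callback. All function names are hypothetical.

```python
import hashlib
from typing import Callable, List, Optional


def preferred_index(proxy_hostname: str, server_count: int) -> int:
    """Map a hostname to a stable index in [0, server_count)."""
    # Assumption: SHA-256 stands in for whatever hash the real proxy uses.
    digest = hashlib.sha256(proxy_hostname.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % server_count


def pick_config_server(
    proxy_hostname: str,
    servers: List[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Pick the preferred server, falling back to the next healthy one."""
    if not servers:
        return None
    if len(servers) == 1:
        # With a single config server there is nothing to switch to.
        return servers[0]
    start = preferred_index(proxy_hostname, len(servers))
    for offset in range(len(servers)):
        candidate = servers[(start + offset) % len(servers)]
        if is_healthy(candidate):
            return candidate
    # No server looks healthy: stay with the preferred one.
    return servers[start]
```

Because the index depends on the hostname, different nodes tend to prefer different config servers, which spreads the load across them.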
For the system to tolerate n failures, ZooKeeper by design requires (2*n)+1 nodes. Consequently, only an odd number of nodes is useful, so a minimum of 3 nodes is needed for a fault-tolerant config system.
Even when using just one config server, the application will still work if the config server goes down (you will of course be unable to deploy application changes). Since the config proxy runs on every node and caches configs, it will continue to serve config to the services on that node. However, restarting a node when config servers are unavailable will lead to a failure of that node, since restarting a node means restarting the config proxy.
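The (2*n)+1 rule can be checked with a few lines of arithmetic. The sketch below uses hypothetical helper names and shows why an even node count adds no fault tolerance over the odd count just below it.

```python
def nodes_required(failures: int) -> int:
    """Nodes needed to tolerate the given number of failures: (2*n)+1."""
    return 2 * failures + 1


def tolerated_failures(nodes: int) -> int:
    """Failures a cluster of this size survives while keeping a majority."""
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return (nodes - 1) // 2


if __name__ == "__main__":
    for nodes in range(1, 8):
        print(nodes, "nodes tolerate", tolerated_failures(nodes), "failure(s)")
```

For example, 3 and 4 nodes both tolerate one failure, so the fourth node buys nothing in terms of fault tolerance.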
Add config server nodes for increased fault tolerance. Procedure:
- Install Vespa on the new config server nodes
- Add config server hosts to VESPA_CONFIGSERVERS on all the nodes
- Restart the config server on the old config server hosts and start it on the new ones
- Update services.xml and hosts.xml with the new set of config servers, then vespa-deploy prepare and vespa-deploy activate
- Restart nodes one by one to start using the new config servers.
Scaling up by Majority
When increasing from 1 to 3 nodes or from 3 to 7, the blank nodes constitute a majority in the cluster. After restarting the config servers, they might not contain the old application data, because the blank nodes might win the election, depending on restart timing. To avoid this, scale up in smaller steps where the blank nodes never form a majority - example:
- Scale from 1 to 2
- Scale from 2 to 3
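Whether a planned step is risky comes down to whether the blank nodes outnumber half of the resulting cluster. A minimal sketch of that check, with a hypothetical function name:

```python
def blank_nodes_form_majority(old_count: int, new_count: int) -> bool:
    """True if the newly added (blank) nodes are a strict majority."""
    if old_count < 1 or new_count <= old_count:
        raise ValueError("expected 1 <= old_count < new_count")
    blank = new_count - old_count
    return 2 * blank > new_count


if __name__ == "__main__":
    for old, new in [(1, 3), (3, 7), (1, 2), (2, 3)]:
        risky = blank_nodes_form_majority(old, new)
        print(f"{old} -> {new}: blank majority = {risky}")
```

The direct steps 1 to 3 and 3 to 7 are flagged, while the stepwise path 1 to 2 to 3 is not.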
Remove config servers from a cluster:
- Remove config server hosts from VESPA_CONFIGSERVERS on all Vespa nodes
- Restart nodes one by one to start using the new set of config servers.
- Restart config servers on the new subset
- Verify that these nodes have data, by using vespa-get-config or vespa-zkcli ls (see below). If they are blank, redo vespa-deploy prepare and vespa-deploy activate. Also see health checks.
- Pull removed hosts from production
ZooKeeper handles data consistency across multiple config servers. The config server Java application runs a ZooKeeper server, embedded with an RPC frontend that the other nodes use. ZooKeeper stores data internally in nodes that can have sub-nodes, similar to a file system.
When starting or restarting the config server, the configuration file for ZooKeeper, $VESPA_HOME/conf/zookeeper/zookeeper.cfg, is generated based on the contents of VESPA_CONFIGSERVERS. Hence, all config servers must be restarted if VESPA_CONFIGSERVERS changes on a config server node.
At vespa-deploy prepare, the application's files, along with global configurations, are stored in ZooKeeper. The application data is stored under /config/v2/tenants/default/sessions/[sessionid]/userapp. At vespa-deploy activate, the newest application is set live by updating the pointer in /config/v2/tenants/default/FIXME to refer to the active app's timestamp. It is at that point the other nodes get configured.
Use vespa-zkcli to inspect state, replace with actual session id:
$ vespa-zkcli ls /config/v2/tenants/default/sessions/[sessionid]/userapp
$ vespa-zkcli get /config/v2/tenants/default/sessions/[sessionid]/userapp/services.xml
The ZooKeeper server logs to $VESPA_HOME/logs/vespa/zookeeper.configserver.0.log (files are rotated with sequence number)
If the config server(s) should experience data corruption, for instance after a hardware failure, use the following recovery procedure. One example of such a scenario is if $VESPA_HOME/logs/vespa/zookeeper.configserver.0.log says java.io.IOException: Negative seek offset at java.io.RandomAccessFile.seek(Native Method), which indicates that ZooKeeper has not been able to recover after a full disk. There is no need to restart Vespa on other nodes during the procedure. Refer to tools for details:
- stop cloudconfig_server
- start cloudconfig_server
- vespa-deploy prepare <application path>
- vespa-deploy activate
Note that by default, the cluster controller that maintains the state of the content cluster uses the same shared ZooKeeper instance, so the content cluster state is also reset when removing state. Manually set state will be lost (e.g. a node with user state down). It is possible to run cluster controllers in standalone ZooKeeper mode - see standalone-zookeeper.
ZooKeeper barrier timeout
If the config servers are heavily loaded, or the applications being deployed are big, the internals of the server may time out when synchronizing with the other servers during deploy. To work around this, increase the timeout by setting VESPA_CONFIGSERVER_ZOOKEEPER_BARRIER_TIMEOUT to 600 (seconds) or higher, and restart the config servers.
To access config from a node not running the config system (e.g. doing feeding via the Document API), use the environment variable VESPA_CONFIG_SOURCES:
$ export VESPA_CONFIG_SOURCES="myadmin0.mydomain.com:12345,myadmin1.mydomain.com:12345"
Alternatively, for Java programs, use the system property configsources and set it programmatically or on the command line with the -D option to Java. The syntax for the value is the same as for VESPA_CONFIG_SOURCES.
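As a small illustration of the value syntax, the sketch below builds the comma-separated host:port list from a list of hosts and derives the matching -D option for Java. The helper names are hypothetical, and the port 12345 is simply the placeholder used in the example above.

```python
from typing import List


def config_sources(hosts: List[str], port: int) -> str:
    """Build a VESPA_CONFIG_SOURCES-style value: host:port,host:port,..."""
    return ",".join(f"{host}:{port}" for host in hosts)


def java_option(sources: str) -> str:
    """The same value passed as the configsources system property."""
    return f"-Dconfigsources={sources}"


if __name__ == "__main__":
    value = config_sources(
        ["myadmin0.mydomain.com", "myadmin1.mydomain.com"], 12345
    )
    print(f'export VESPA_CONFIG_SOURCES="{value}"')
    print(java_option(value))
```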
The default heap size for the JVM that the config server runs under is 1.5 GB (which can be changed with a setting). The config server writes a transaction log that is regularly purged of old items, so little disk space is required. Note that running on a server that has a lot of disk I/O will adversely affect performance and is not recommended.
The config server RPC port can be changed by setting VESPA_CONFIGSERVER_RPC_PORT on all nodes in the system. Changing HTTP port requires changing the port in $VESPA_HOME/conf/configserver-app/services.xml:
<http>
    <server port="12345" id="configserver" />
</http>
When deploying, use the -p option if the port is changed from the default.
|Health checks||Verify that a config server is up and running using the Health and Metric APIs, like http://myserver0.mydomain.com:12345/state/v1/health. Metrics are found at http://myserver0.mydomain.com:12345/state/v1/metrics. Use vespa-model-inspect to find host and port number|
|Bad Node||If running with more than one config server and one of these goes down or has a hardware failure, the cluster will still work and serve config as usual (clients will switch to one of the good nodes). It is not necessary to remove a bad node from the configuration. Deploying applications will take a long time, since vespa-deploy will not be able to complete a deployment on all servers when one of them is down. If this is troublesome, lower the barrier timeout (default value is 120 seconds). Note also that if you have not configured cluster controllers explicitly, these will run on the config server nodes, and their operation might be affected. This is another reason for not trying to manually remove a bad node from the config server setup|
|Stuck filedistribution||The config system distributes binary files (such as jar bundle files) using file-distribution - use vespa-status-filedistribution if it gets stuck|
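The health check in the table above can be scripted. The following is a minimal sketch, not an official client: the function names are hypothetical, and it assumes the health endpoint returns JSON with a status object whose code field is "up" when the server is healthy. Verify that assumption against the Health and Metric API reference before relying on it.

```python
import json
from urllib.request import urlopen


def health_url(host: str, port: int) -> str:
    """URL of the health endpoint for a config server."""
    return f"http://{host}:{port}/state/v1/health"


def is_up(payload: str) -> bool:
    """Interpret a health response body.

    Assumption: the body looks like {"status": {"code": "up"}, ...}.
    """
    try:
        doc = json.loads(payload)
    except json.JSONDecodeError:
        return False
    status = doc.get("status") if isinstance(doc, dict) else None
    return isinstance(status, dict) and status.get("code") == "up"


def check_config_server(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if the server answers and reports itself as up."""
    try:
        with urlopen(health_url(host, port), timeout=timeout) as response:
            return is_up(response.read().decode("utf-8"))
    except OSError:
        # Connection refused, timeout, DNS failure, or HTTP error.
        return False
```

Calling `check_config_server("myserver0.mydomain.com", 12345)` for each host in VESPA_CONFIGSERVERS gives a quick view of which config servers respond.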