# Administrative Procedures

## Install

Refer to the multinode install documentation for a primer on how to set up a cluster, and try the Multinode testing and observability sample app to get familiar with interfaces and behavior. The required architecture is x86_64.

All Vespa processes have a PID file $VESPA_HOME/var/run/{service name}.pid, where {service name} is the Vespa service name, e.g. container or distributor. This is the same name used in the administration interface in the config sentinel. Learn how to start / stop nodes.

## Status pages

Vespa service instances have status pages for debugging and testing. Status pages are subject to change at any time - take care when automating. Procedure:

1. Find the port: The status pages run on ports assigned by Vespa. To find status page ports, use vespa-model-inspect to list the services run in the application:

   ```
   $ vespa-model-inspect services
   ```

   To find the status page port for a specific service on a specific node, pick the correct service and run:

   ```
   $ vespa-model-inspect service [Options] <service-name>
   ```

2. Get the status and metrics: distributor, storagenode, searchnode and container-clustercontroller are content services with status pages. These ports are tagged HTTP. The cluster controller has multiple ports tagged HTTP; the port tagged STATE is the one with the status page. Try connecting to the root at the port, or to /state/v1/metrics. The distributor and storagenode status pages are available at /:

   ```
   $ vespa-model-inspect service searchnode

   searchnode @ myhost.mydomain.com : search
   search/search/cluster.search/0
   tcp/myhost.mydomain.com:19110 (STATUS ADMIN RTC RPC)
   tcp/myhost.mydomain.com:19111 (FS4)
   tcp/myhost.mydomain.com:19112 (TEST HACK SRMP)
   tcp/myhost.mydomain.com:19113 (ENGINES-PROVIDER RPC)
   tcp/myhost.mydomain.com:19114 (HEALTH JSON HTTP)

   $ curl http://myhost.mydomain.com:19114/state/v1/metrics
   ...

   $ vespa-model-inspect service distributor

   distributor @ myhost.mydomain.com : content
   search/distributor/0
   tcp/myhost.mydomain.com:19116 (MESSAGING)
   tcp/myhost.mydomain.com:19117 (STATUS RPC)
   tcp/myhost.mydomain.com:19118 (STATE STATUS HTTP)

   $ curl http://myhost.mydomain.com:19118/state/v1/metrics
   ...

   $ curl http://myhost.mydomain.com:19118/
   ...
   ```
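When automating against these interfaces, the STATE-tagged HTTP port can be picked out of vespa-model-inspect output programmatically. A minimal sketch, assuming the output format shown above (the parsing helper and sample text are illustrative, not part of any Vespa tooling):

```python
import re
from typing import Optional

def find_state_port(output: str) -> Optional[int]:
    """Return the port of the first tcp endpoint tagged both STATE and HTTP."""
    for line in output.splitlines():
        m = re.match(r"tcp/[^:]+:(\d+) \(([^)]*)\)", line.strip())
        if m:
            tags = m.group(2).split()
            if "STATE" in tags and "HTTP" in tags:
                return int(m.group(1))
    return None  # no STATE+HTTP port found

# Sample output as printed by vespa-model-inspect above
sample = """distributor @ myhost.mydomain.com : content
search/distributor/0
tcp/myhost.mydomain.com:19116 (MESSAGING)
tcp/myhost.mydomain.com:19117 (STATUS RPC)
tcp/myhost.mydomain.com:19118 (STATE STATUS HTTP)"""

print(find_state_port(sample))  # -> 19118
```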

3. Use the cluster controller status page: A status page for the cluster controller is available at its status port, at http://hostname:port/clustercontroller-status/v1/<clustername>. If clustername is not specified, the available clusters are listed. The cluster controller leader's status page shows whether any nodes are operating with differing cluster state versions, and how many data buckets are pending merging (document set reconciliation) because replicas are either missing or out of sync.
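The /state/v1/metrics responses referenced above can also be consumed programmatically. A minimal sketch; the payload below is illustrative and its exact shape is an assumption - verify against the actual endpoint of your Vespa version:

```python
import json

# Illustrative /state/v1/metrics payload (shape and metric names are assumptions)
payload = json.loads("""
{
  "status": {"code": "up"},
  "metrics": {
    "snapshot": {"from": 1700000000.0, "to": 1700000060.0},
    "values": [
      {"name": "vds.datastored.alldisks.docs",  "values": {"last": 12345}},
      {"name": "vds.datastored.alldisks.bytes", "values": {"last": 67890}}
    ]
  }
}
""")

def metric_values(payload: dict) -> dict:
    """Map metric name -> latest reported value from a metrics payload."""
    return {m["name"]: m["values"].get("last")
            for m in payload["metrics"]["values"]}

print(payload["status"]["code"])   # service health, e.g. "up"
print(metric_values(payload))
```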

## Clean start mode

There have been rare occasions where Vespa stored data that was internally inconsistent. For those circumstances, it is possible to start the node in validate_and_sanitize_docstore mode, which will do its best to clean up inconsistent data. However, detecting that this is required is not easy - consult the Vespa Team first. For this approach to work, all nodes must be stopped before enabling the feature, to make sure the data is not redistributed.

## Content cluster configuration

### Availability vs. resources

Keeping index structures costs resources. Not all replicas of buckets are necessarily searchable, unless configured using searchable-copies. As Vespa indexes buckets on demand, the most cost-efficient setting is 1, if one can tolerate temporary coverage loss during node failures. Note that searchable-copies does not apply to streaming search, as streaming search does not use index structures.

When a document is removed, the document data is not immediately purged. Instead, remove-entries (tombstones of removed documents) are kept for a configurable amount of time, by default two weeks - refer to pruneremoveddocumentsage. This ensures that removed documents stay removed in a distributed system where nodes change state. Entries are removed periodically after expiry. Hence, if a node comes back up after being down for more than two weeks, removed documents become available again, unless the data on the node is wiped first. A larger pruneremoveddocumentsage will grow the storage size, as documents and tombstones are kept longer.

Note: The backend does not store remove-entries for nonexistent documents. This prevents clients sending wrong document identifiers from filling a cluster with invalid remove-entries. A side effect is that if a problem has caused all replicas of a bucket to be unavailable, documents in this bucket cannot be marked removed until at least one replica is available again. Documents are written to new bucket replicas while the others are down; if these documents are then removed, older versions of them will not re-emerge when the old replicas come back, as the most recent change wins.

See transition-time for the tradeoff between how quickly nodes are set down and system stability. One can configure how many times a node is allowed to crash before it is automatically removed. The crash count is reset if the node has been up or down continuously for more than the stable state period. If the crash count exceeds max premature crashes, the node will be disabled.
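Both searchable-copies and tombstone pruning are configured per content cluster in services.xml. A sketch of where the relevant elements live, with illustrative values - verify element names and defaults against the services.xml reference for your Vespa version:

```xml
<content id="search" version="1.0">
  <redundancy>2</redundancy>
  <engine>
    <proton>
      <!-- Index only one replica per bucket: cheapest setting,
           at the cost of temporary coverage loss on node failure -->
      <searchable-copies>1</searchable-copies>
      <tuning>
        <searchnode>
          <removed-db>
            <prune>
              <!-- pruneremoveddocumentsage: keep tombstones this long
                   (illustrative value: two weeks, in seconds) -->
              <age>1209600</age>
            </prune>
          </removed-db>
        </searchnode>
      </tuning>
    </proton>
  </engine>
</content>
```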
Refer to troubleshooting.

A cluster is typically sized to handle a given load. A given percentage of the cluster resources is required for normal operations, and the remainder is the headroom that can be used if some of the nodes are no longer usable. If the cluster loses enough nodes, it will be overloaded:

- Remaining nodes may run full on disk. This will likely fail many write operations, and if the disk is shared with the OS, it may also stop the node from functioning.
- Partition queues will grow to maximum size. As queues are processed in FIFO order, operations are likely to get long latencies.
- Many operations may time out while being processed, causing the operations to be resent and adding more load to the cluster.
- When new nodes are added, they cannot serve requests before data is moved to them from the already overloaded nodes. Moving data puts even more load on the existing nodes, and as moving data is typically not high priority, this may never actually happen.

To configure the minimal cluster size, use min-distributor-up-ratio and min-storage-up-ratio.
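The thresholds discussed in this section live under the cluster-controller tuning element in services.xml. A sketch with illustrative values only - verify names, units and sensible values against the services.xml reference before use:

```xml
<content id="search" version="1.0">
  <redundancy>2</redundancy>
  <tuning>
    <cluster-controller>
      <!-- All values below are illustrative, not recommendations -->
      <transition-time>0</transition-time>
      <max-premature-crashes>4</max-premature-crashes>
      <stable-state-period>1h</stable-state-period>
      <!-- Cluster stops serving if fewer than this fraction of nodes are up -->
      <min-distributor-up-ratio>0.2</min-distributor-up-ratio>
      <min-storage-up-ratio>0.2</min-storage-up-ratio>
    </cluster-controller>
  </tuning>
</content>
```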