State API
The cluster controller has a RESTful API for viewing and modifying the state of a content cluster. To find the URL of the State API, identify the cluster controller services in your application. Only the master cluster controller can respond. The master is the live cluster controller with the lowest index. You will therefore typically use cluster controller 0; if you fail to contact it, try number 1, and so on. Using the vespa-model-inspect command line tool:
$ vespa-model-inspect service -u container-clustercontroller
container-clustercontroller @ hostname.domain.com : admin
admin/cluster-controllers/0
    http://hostname.domain.com:19050/ (STATE EXTERNAL QUERY HTTP)
    http://hostname.domain.com:19117/ (EXTERNAL HTTP)
    tcp/hostname.domain.com:19118 (MESSAGING RPC)
    tcp/hostname.domain.com:19119 (ADMIN RPC)
In this example there is only one cluster controller, and the State REST API is available on the port marked STATE and HTTP, namely 19050. This information can also be retrieved through the model config in the config server.
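The "try index 0, then 1, and so on" rule above can be sketched as a small helper. This is a minimal illustration, not an official client: the function names, the candidate URL list, and the choice of /cluster/v2 as the probe path are assumptions made for the example.

```python
import json
import urllib.request
from typing import Callable, Iterable, Optional


def http_get_json(url: str, timeout: float = 5.0) -> dict:
    """Fetch a URL and decode the JSON body (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def first_responding_controller(
    base_urls: Iterable[str],
    fetch: Callable[[str], dict] = http_get_json,
) -> Optional[str]:
    """Return the first base URL, in the order given, that answers /cluster/v2.

    Pass the cluster controller URLs sorted by index, lowest first.
    """
    for base_url in base_urls:
        try:
            fetch(base_url.rstrip("/") + "/cluster/v2")
        except (OSError, ValueError):
            # Connection failure, timeout, HTTP error status, or a body
            # that is not JSON: move on to the next index.
            continue
        return base_url
    return None
```

The fetch parameter is injectable so that the ordering logic can be exercised without a running cluster.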
Types
Type | Spec | Example | Description |
---|---|---|---|
cluster | | | The name given to a content cluster in a Vespa application. |
description | | Some \"JSON escaped\" text | Description can contain anything that is valid JSON. However, as the information is presented in various interfaces, some of which may present reasons for all the states in a cluster or similar, keeping it short and to the point makes it easier to fit the information neatly into a table and get a better cluster overview. |
group-spec | | | The hierarchical group assignment of a given content node. This is a dot-separated list of identifiers given in the application services.xml configuration. |
node | | | The index or distribution key identifying a given node within the context of a content cluster and a service type. |
partition | | | The index of a partition within the context of a content cluster, a service type and a node. |
service-type | | | The type of the service to look at state for, within the context of a given content cluster. |
state-disk | | | One of the valid disk states. |
state-generated | | | One of the valid node generated states. |
state-unit | | | One of the valid node unit states. |
state-user | | | One of the valid node user states. |
Errors
Errors are indicated using HTTP status codes. An error response from the State API includes a JSON-encoded body with extra information. As a request may fail outside of the State API, there is no guarantee that such a JSON representation exists. To make it simpler for clients, all errors they need to handle specifically are expressed as HTTP status codes. The content of the JSON error report, if it exists, is therefore left unspecified and is only meant to improve an error report where needed. Note: Do not depend on the JSON content for anything other than improving error reports, as the contents may change at any time.
Cluster controller not master - master known
This error means you are talking to the wrong cluster controller. The response is a standard HTTP redirect, so your HTTP client may automatically redo the request against the correct cluster controller. If it does not, you may want to handle this case specifically.
Since the available cluster controller with the lowest index is the master, you will typically query the cluster controllers in index order. In that case you are unlikely to ever get this error; you will instead fail to connect to a cluster controller that is not the current master.
HTTP/1.1 303 See Other
Location: http://<master>/<current-request>
Content-Type: application/json

{
    "message" : "Cluster controller <index> not master. Use master at index <index>."
}
Cluster controller not master - unknown or no master
This error is returned if the cluster controller asked is not master and does not know who the master is. This can happen, for instance, in a network split where cluster controller 0 can no longer reach cluster controllers 1 and 2. Cluster controller 0 then knows it is not master, as it cannot see a majority, while cluster controllers 1 and 2 will elect 1 as master.
HTTP/1.1 503 Service Unavailable
Content-Type: application/json

{
    "message" : "No known master cluster controller currently exist."
}
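Because every error a client must handle is expressed in the status code, a client can decide what to do without parsing the JSON body. The following is a minimal sketch of that decision; the function name and the action labels are hypothetical and chosen only for illustration.

```python
from typing import Mapping, Optional, Tuple


def next_step(status: int, headers: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Map a State API response to a client action, using the status code only.

    Returns (action, redirect_target). The action labels are illustrative.
    """
    if 200 <= status < 300:
        return ("ok", None)
    if status == 303:
        # Not master, master known: repeat the request at the Location URL.
        return ("redirect", headers.get("Location"))
    if status == 503:
        # Not master, master unknown: try another cluster controller.
        return ("try-other-controller", None)
    # Anything else: report the failure, optionally quoting the JSON message.
    return ("fail", None)
```

Many HTTP clients follow a 303 redirect on their own, in which case only the 503 branch needs explicit handling.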
Recursive mode
To use recursive mode, specify the recursive URL parameter with a numeric value for the number of levels. A value of true is also valid and returns all levels. Examples: Use recursive=1 on a node request to also see all the partition data. Use recursive=2 on a cluster request to see all the node data within each service type, without getting all the partition information too.
In recursive mode, you will see the same output as you find in the spec below. However, where there is a { "link" : "<url-path>" } element, this element will be replaced by the content of that request, given a recursive value of one less than the request above.
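The substitution rule above can be modelled on the client side. The sketch below is not how the cluster controller implements recursion; it only mirrors the described result, with hypothetical function names and a caller-supplied fetch function that returns the decoded JSON for a link path.

```python
from typing import Any, Callable, Union
from urllib.parse import urlencode


def with_recursive(path: str, levels: Union[int, bool]) -> str:
    """Append the recursive URL parameter to a State API path."""
    value = "true" if levels is True else str(int(levels))
    return f"{path}?{urlencode({'recursive': value})}"


def expand_links(document: Any, fetch: Callable[[str], Any], levels: Union[int, bool]) -> Any:
    """Replace {"link": path} elements by the linked content, `levels` deep.

    levels=True expands everything; levels=0 leaves the document unchanged.
    """
    if levels is not True and levels <= 0:
        return document
    # Linked content is expanded with a recursive value of one less.
    remaining = True if levels is True else levels - 1

    def walk(value: Any) -> Any:
        if isinstance(value, dict):
            if set(value) == {"link"}:
                return expand_links(fetch(value["link"]), fetch, remaining)
            return {key: walk(child) for key, child in value.items()}
        return value

    return walk(document)
```

In practice you would simply request with_recursive(path, levels) and let the server do the expansion; expand_links is useful mainly for reasoning about what a given recursive value returns.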
Functions
The following lists all available functions. Note that requests and responses carry more headers than the ones shown; for instance, every request with content has a Content-Length header.
List the existing content clusters
HTTP GET /cluster/v2
Example success result:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "cluster" : {
        "music" : { "link" : "/cluster/v2/music" },
        "books" : { "link" : "/cluster/v2/books" }
    }
}
Get cluster state and list the various service types within the cluster
HTTP GET /cluster/v2/<cluster>
Example success result:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "state" : {
        "generated" : {
            "state" : "<state-generated>",
            "reason" : "<description>"
        }
    },
    "service" : {
        "distributor" : { "link" : "/cluster/v2/mycluster/distributor" },
        "storage" : { "link" : "/cluster/v2/mycluster/storage" }
    }
}
List the nodes of a given service type for a cluster
HTTP GET /cluster/v2/<cluster>/<service-type>
Example success result:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "node" : {
        "0" : { "link" : "/cluster/v2/mycluster/storage/0" },
        "1" : { "link" : "/cluster/v2/mycluster/storage/1" }
    }
}
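The three listing requests chain naturally: each response contains links to the next level. As a rough sketch, a client could collect every node path by following those links. The function name is hypothetical, and fetch is an assumed caller-supplied function that returns the decoded JSON for a path.

```python
from typing import Callable, List


def all_node_paths(fetch: Callable[[str], dict]) -> List[str]:
    """Walk clusters, then service types, then nodes, collecting node links."""
    paths: List[str] = []
    clusters = fetch("/cluster/v2").get("cluster", {})
    for cluster in clusters.values():
        services = fetch(cluster["link"]).get("service", {})
        for service in services.values():
            nodes = fetch(service["link"]).get("node", {})
            paths.extend(node["link"] for node in nodes.values())
    return paths
```

Following the link values rather than building paths by hand keeps the client independent of how the paths are composed.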
Get node state
HTTP GET /cluster/v2/<cluster>/<service-type>/<node>
Example success result:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "attributes" : {
        "hierarchical-group" : "<group-spec>"
    },
    "partition" : {
        "0" : { "link" : "/cluster/v2/mycluster/storage/0/0" }
    },
    "state" : {
        "unit" : {
            "state" : "<state-unit>",
            "reason" : "<description>"
        },
        "generated" : {
            "state" : "<state-generated>",
            "reason" : "<description>"
        },
        "user" : {
            "state" : "<state-user>",
            "reason" : "<description>"
        }
    }
}
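A node response carries up to three states side by side: unit, generated and user. A small, illustrative helper for pulling them out of a decoded response might look as follows; the function name is an assumption, and the state values in the usage comment are sample values only.

```python
from typing import Dict


def summarize_node_state(node: dict) -> Dict[str, str]:
    """Return {kind: state} for the unit, generated and user states present."""
    states = node.get("state", {})
    return {
        kind: states[kind]["state"]
        for kind in ("unit", "generated", "user")
        if kind in states
    }


# Usage with a decoded node response (sample values):
# summarize_node_state({"state": {"user": {"state": "retired", "reason": "x"}}})
```

The reason fields are deliberately left out of the summary, since they are free-form descriptions meant for display.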
Get partition state
HTTP GET /cluster/v2/<cluster>/<service-type>/<node>/<partition>
Example success result:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "metrics" : {
        "bucket-count" : <integer>,
        "unique-document-count" : <integer>,
        "unique-document-total-size" : <integer>
    },
    "state" : {
        "generated" : {
            "state" : "<state-disk>",
            "reason" : "<description>"
        }
    }
}
Set node user state
HTTP PUT /cluster/v2/<cluster>/<service-type>/<node>
Content-Type: application/json

{
    "state" : {
        "user" : {
            "state" : "retired",
            "reason" : "This colo will be removed soon"
        }
    }
}
Success result:
HTTP/1.1 200 OK
Content-Type: application/json

{
    "wasModified": true,
    "reason": "ok"
}
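The PUT request for setting a node user state can be assembled with the Python standard library. This is a sketch under assumptions: the function name, the base URL and the argument values are made up for illustration, and the request is only built here, not sent.

```python
import json
import urllib.request


def build_set_user_state_request(
    base_url: str,
    cluster: str,
    service_type: str,
    node: int,
    state: str,
    reason: str,
) -> urllib.request.Request:
    """Build (but do not send) the PUT request that sets a node's user state."""
    body = json.dumps({"state": {"user": {"state": state, "reason": reason}}})
    url = f"{base_url.rstrip('/')}/cluster/v2/{cluster}/{service_type}/{node}"
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# To send it against a real master cluster controller (not executed here):
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())  # e.g. checks result["wasModified"]
```

Separating request construction from sending makes the payload easy to inspect before it modifies cluster state.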