State REST API

The cluster controller has a RESTful API for viewing and modifying the state of a content cluster. To find the URL of the State API, identify the cluster controller services in your application. Only the master cluster controller can respond. The master is the live cluster controller with the lowest index, so you will typically use cluster controller 0; if you fail to contact it, try number 1, and so on. The services can be listed with the vespa-model-inspect command line tool:

$ vespa-model-inspect service -u container-clustercontroller
container-clustercontroller @ hostname.domain.com : admin
admin/cluster-controllers/0
    http://hostname.domain.com:19050/ (STATE EXTERNAL QUERY HTTP)
    http://hostname.domain.com:19117/ (EXTERNAL HTTP)
    tcp/hostname.domain.com:19118 (MESSAGING RPC)
    tcp/hostname.domain.com:19119 (ADMIN RPC)

In this example, there is only one cluster controller, and the State REST API is available on the port tagged both STATE and HTTP, here 19050. This information can also be retrieved through the model config in the config server.
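
As a rough illustration, the port lookup described above could be scripted. The sketch below assumes the output format shown in the example (a URL followed by a parenthesized tag list). The function name state_api_urls is hypothetical and not part of any Vespa tooling.

```python
import re

# Matches lines such as:
#     http://hostname.domain.com:19050/ (STATE EXTERNAL QUERY HTTP)
_LINE = re.compile(r"^\s*(http://\S+)\s+\(([^)]*)\)\s*$")


def state_api_urls(inspect_output: str) -> list[str]:
    """Return the URLs whose tag list contains both STATE and HTTP."""
    urls = []
    for line in inspect_output.splitlines():
        match = _LINE.match(line)
        if not match:
            continue  # header lines and tcp/ entries are skipped
        url, tags = match.group(1), match.group(2).split()
        if "STATE" in tags and "HTTP" in tags:
            urls.append(url)
    return urls


# Sample copied from the vespa-model-inspect example above.
SAMPLE = """\
container-clustercontroller @ hostname.domain.com : admin
admin/cluster-controllers/0
    http://hostname.domain.com:19050/ (STATE EXTERNAL QUERY HTTP)
    http://hostname.domain.com:19117/ (EXTERNAL HTTP)
    tcp/hostname.domain.com:19118 (MESSAGING RPC)
    tcp/hostname.domain.com:19119 (ADMIN RPC)
"""

print(state_api_urls(SAMPLE))
```

With several cluster controllers, the returned list would hold one URL per controller, in the order they appear in the output.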

Types

Type             Spec                                        Example                     Description
cluster          <identifier>                                music                       The name given to a content cluster in a Vespa application.
description      .*                                          Some \"JSON escaped\" text  Description can contain anything that is valid JSON. However, as the information is presented in various interfaces, some of which may present reasons for all the states in a cluster or similar, keeping it short and to the point makes it easier to fit the information neatly into a table and get a better cluster overview.
group-spec       <identifier>(\.<identifier>)*               asia.switch0                The hierarchical group assignment of a given content node. This is a dot-separated list of identifiers given in the application services.xml configuration.
node             [0-9]+                                      0                           The index or distribution key identifying a given node within the context of a content cluster and a service type.
partition        [0-9]+                                      0                           The index of a partition within the context of a content cluster, a service type and a node.
service-type     (distributor|storage)                       distributor                 The type of the service to look at state for, within the context of a given content cluster.
state-disk       (up|down)                                   up                          One of the valid disk states.
state-generated  (initializing|up|down|retired|maintenance)  up                          One of the valid node generated states.
state-unit       (initializing|up|stopping|down)             up                          One of the valid node unit states.
state-user       (up|down|retired|maintenance)               up                          One of the valid node user states.
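
To show how these specs might be used on the client side, here is a minimal sketch that validates path segments before building a /cluster/v2 request path. The helper name state_api_path is hypothetical. The cluster pattern is also an assumption, since the table only says <identifier>: the sketch merely rejects empty names and names containing a slash.

```python
import re

# service-type, node and partition patterns are copied from the table above.
# The cluster pattern is an assumption (see the lead-in).
_CLUSTER = re.compile(r"[^/]+")
_SERVICE_TYPE = re.compile(r"(distributor|storage)")
_INDEX = re.compile(r"[0-9]+")


def state_api_path(cluster=None, service_type=None, node=None, partition=None):
    """Build a /cluster/v2 path, validating each segment that is given."""
    segments = []
    seen_gap = False
    for name, value, pattern in (
        ("cluster", cluster, _CLUSTER),
        ("service-type", service_type, _SERVICE_TYPE),
        ("node", node, _INDEX),
        ("partition", partition, _INDEX),
    ):
        if value is None:
            seen_gap = True
            continue
        if seen_gap:
            # e.g. a node index without a service type makes no sense
            raise ValueError(f"{name} given without its parent segment")
        text = str(value)
        if not pattern.fullmatch(text):
            raise ValueError(f"invalid {name}: {text!r}")
        segments.append(text)
    return "/".join(["/cluster/v2"] + segments)


print(state_api_path("music", "storage", 0))
```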

Errors

Errors are indicated using HTTP status codes. An error response from the State API includes a JSON-encoded body with extra information. As a request may fail outside of the State API, there is no guarantee that such a JSON representation exists. To keep things simple for clients, every error they need to handle specifically is signaled through the HTTP status code. The content of the JSON error report, if it exists, is therefore left unspecified and should only be used to improve error reporting. Note: Do not depend on the JSON content for anything other than improving error reports, as the contents may change at any time.

Cluster controller not master - master known

This error means you are talking to the wrong cluster controller. The response is a standard HTTP redirect, so your HTTP client may automatically redo the request on the correct cluster controller. If it does not, you may need to handle the redirect explicitly.

Note that since the available cluster controller with the lowest index is the master, you will typically query the cluster controllers in index order. In that case you are unlikely to ever get this error. Instead, you will fail to connect to any cluster controller with a lower index than the current master.

    HTTP/1.1 303 See Other
    Location: http://<master>/<current-request>
    Content-Type: application/json

    {
        "message" : "Cluster controller <index> not master. Use master at index <index>."
    }

Cluster controller not master - unknown or no master

This error is returned when the cluster controller asked is not master and does not know who the master is. This can happen, for instance, in a network split where cluster controller 0 can no longer reach cluster controllers 1 and 2. Cluster controller 0 then knows it is not master, as it cannot see the majority, while cluster controllers 1 and 2 will elect 1 as master.

    HTTP/1.1 503 Service Unavailable
    Content-Type: application/json

    {
        "message" : "No known master cluster controller currently exist."
    }
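
The two error cases above, together with the advice to try cluster controllers in index order, suggest a simple client loop. The following is a sketch under stated assumptions, not a reference client. The fetch callable, the cc0/cc1/cc2 host names and the function name query_master are all hypothetical. fetch is assumed to perform a GET without following redirects, to return (status, headers, body), and to raise OSError when the connection fails.

```python
def query_master(controller_urls, path, fetch):
    """Try cluster controllers in index order and return the master's body."""
    for base in controller_urls:
        url = base.rstrip("/") + path
        try:
            status, headers, body = fetch(url)
            if status == 303 and "Location" in headers:
                # Not master, but the master is known: redo the request there.
                status, headers, body = fetch(headers["Location"])
        except OSError:
            continue  # controller unreachable: try the next index
        if status == 200:
            return body
        if status == 503:
            continue  # this controller knows of no master: try the next index
        raise RuntimeError(f"unexpected HTTP status {status} for {url}")
    raise RuntimeError("no master cluster controller could be reached")


def demo_fetch(url):
    """Stand-in for a real HTTP client, used only to exercise the sketch."""
    if url.startswith("http://cc0:"):
        raise OSError("connection refused")
    if url.startswith("http://cc1:"):
        return 303, {"Location": "http://cc2:19050/cluster/v2"}, ""
    return 200, {}, '{"cluster": {}}'


CONTROLLERS = ["http://cc0:19050/", "http://cc1:19050/"]
print(query_master(CONTROLLERS, "/cluster/v2", demo_fetch))
```

Only the status code drives the decisions here. The JSON error body is never inspected, in line with the note that its contents may change.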

Recursive mode

To use recursive mode, specify the recursive URL parameter and give it a numeric value for the number of levels. A value of true is also valid and returns all levels. Examples: Use recursive=1 for a node request to also see all the partition data. Use recursive=2 to see all the node data within each service type, without also getting the partition information.

In recursive mode, you will see the same output as you find in the spec below. However, where there is a { "link" : "<url-path>" } element, this element will be replaced by the content of that request, given a recursive value of one less than the request above.
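
The substitution rule can be modeled in a few lines. The server performs this expansion itself, so the sketch below is only an illustration of the rule and not something a client needs to run. The names expand_links and TREE are hypothetical, and None stands in for recursive=true.

```python
def expand_links(doc, fetch, levels):
    """Replace {"link": path} elements with the content found at that path.

    fetch(path) is a caller-supplied lookup that returns the parsed JSON
    for a path. levels is the recursive value; None means all levels.
    """
    if levels is not None and levels <= 0:
        return doc
    if not isinstance(doc, dict):
        return doc
    if set(doc) == {"link"}:
        # The linked content is expanded with a recursive value one lower.
        remaining = None if levels is None else levels - 1
        return expand_links(fetch(doc["link"]), fetch, remaining)
    return {key: expand_links(value, fetch, levels) for key, value in doc.items()}


# A tiny in-memory stand-in for the API, keyed by URL path.
TREE = {
    "/cluster/v2/music/storage": {
        "node": {"0": {"link": "/cluster/v2/music/storage/0"}}
    },
    "/cluster/v2/music/storage/0": {
        "partition": {"0": {"link": "/cluster/v2/music/storage/0/0"}}
    },
    "/cluster/v2/music/storage/0/0": {
        "state": {"generated": {"state": "up", "reason": ""}}
    },
}

service = TREE["/cluster/v2/music/storage"]
print(expand_links(service, TREE.__getitem__, 1))
```

With a value of 1 on the service type request, the node link is replaced by the node content while the partition link inside it is left as a link.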

Functions

Here follows a list of all the available functions. Note that more headers than the ones shown will be present. For instance, all requests with content will have a content length.

List the existing content clusters

    HTTP GET /cluster/v2

Example success result:

    HTTP/1.1 200 OK
    Content-Type: application/json

    {
        "cluster" : {
            "music" : {
                "link" : "/cluster/v2/music"
            },
            "books" : {
                "link" : "/cluster/v2/books"
            }
        }
    }

Get cluster state and list the various service types within the cluster

    HTTP GET /cluster/v2/<cluster>

Example success result:

    HTTP/1.1 200 OK
    Content-Type: application/json

    {
        "state" : {
            "generated" : {
                "state" : "<state-generated>",
                "reason" : "<description>"
            }
        },
        "service" : {
            "distributor" : {
                "link" : "/cluster/v2/mycluster/distributor"
            },
            "storage" : {
                "link" : "/cluster/v2/mycluster/storage"
            }
        }
    }

List the nodes of a given service type for a cluster

    HTTP GET /cluster/v2/<cluster>/<service-type>

Example success result:

    HTTP/1.1 200 OK
    Content-Type: application/json

    {
        "node" : {
            "0" : {
                "link" : "/cluster/v2/mycluster/storage/0"
            },
            "1" : {
                "link" : "/cluster/v2/mycluster/storage/1"
            }
        }
    }

Get node state

    HTTP GET /cluster/v2/<cluster>/<service-type>/<node>

Example success result:

    HTTP/1.1 200 OK
    Content-Type: application/json

    {
        "attributes" : {
            "hierarchical-group" : "<group-spec>"
        },
        "partition" : {
            "0" : {
                "link" : "/cluster/v2/mycluster/storage/0/0"
            }
        },
        "state" : {
            "unit" : {
                "state" : "<state-unit>",
                "reason" : "<description>"
            },
            "generated" : {
                "state" : "<state-generated>",
                "reason" : "<description>"
            },
            "user" : {
                "state" : "<state-user>",
                "reason" : "<description>"
            }
        }
    }
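
A client will often only need the three state values from this response. A minimal sketch of extracting them, assuming the response has been parsed into a dictionary of the shape shown above (the function name node_states is hypothetical):

```python
def node_states(node_doc):
    """Return the unit, generated and user state values of a node response.

    Kinds that are missing from the response map to None.
    """
    states = node_doc.get("state", {})
    return {
        kind: states.get(kind, {}).get("state")
        for kind in ("unit", "generated", "user")
    }


# Example document following the response layout above, with made-up values.
EXAMPLE = {
    "attributes": {"hierarchical-group": "asia.switch0"},
    "state": {
        "unit": {"state": "up", "reason": ""},
        "generated": {"state": "retired", "reason": ""},
        "user": {"state": "retired", "reason": "This colo will be removed soon"},
    },
}

print(node_states(EXAMPLE))
```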

Get partition state

    HTTP GET /cluster/v2/<cluster>/<service-type>/<node>/<partition>

Example success result:

    HTTP/1.1 200 OK
    Content-Type: application/json

    {
        "metrics" : {
            "bucket-count" : <integer>,
            "unique-document-count" : <integer>,
            "unique-document-total-size" : <integer>
        },
        "state" : {
            "generated" : {
                "state" : "<state-disk>",
                "reason" : "<description>"
            }
        }
    }

Set node user state

    HTTP PUT /cluster/v2/<cluster>/<service-type>/<node>

Example success result:

    Content-Type: application/json

    {
        "state" : {
            "user" : {
                "state" : "retired",
                "reason" : "This colo will be removed soon"
            }
        }
    }
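
As a sketch of how such a request might be assembled with the Python standard library: the code below assumes the PUT body has the same shape as the JSON shown above, and that the base URL is the STATE HTTP address found earlier. The function name set_user_state_request is hypothetical. The request is only built, not sent.

```python
import json
import urllib.request

# Allowed values for state-user, copied from the Types table.
USER_STATES = ("up", "down", "retired", "maintenance")


def set_user_state_request(base_url, cluster, service_type, node, state, reason):
    """Build (but do not send) a PUT request that sets a node's user state."""
    if state not in USER_STATES:
        raise ValueError(f"invalid user state: {state!r}")
    body = {"state": {"user": {"state": state, "reason": reason}}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/cluster/v2/{cluster}/{service_type}/{node}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = set_user_state_request(
    "http://hostname.domain.com:19050/",
    "music",
    "storage",
    0,
    "retired",
    "This colo will be removed soon",
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it. Error handling then follows the status codes described in the Errors section.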