

Vespa provides metrics integration with CloudWatch, Datadog and Prometheus / Grafana, as well as a JSON HTTP API.

There are two main approaches to transfer metrics to an external system:

  • Have the external system pull metrics from Vespa
  • Make Vespa push metrics to the external system

Pulling metrics from Vespa

All pull-based solutions use Vespa's metrics API, which provides metrics in JSON format, either for the full system or for a single node. Poll at most once every 30 seconds: more frequent polling does not increase granularity and only adds unnecessary load on your systems.
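
The 30-second limit can be enforced in the polling client itself. The following is a minimal Python sketch, not an official client: the names effective_interval, fetch_json and poll are hypothetical, and the metrics API URL is left as a parameter because this section does not state its path.

```python
import json
import time
from urllib.request import urlopen

# From the text above: poll at most once every 30 seconds.
MIN_POLL_INTERVAL_SECONDS = 30


def effective_interval(requested_seconds):
    """Return the requested interval, raised to the 30-second floor if needed."""
    return max(MIN_POLL_INTERVAL_SECONDS, requested_seconds)


def fetch_json(url, timeout=10):
    """Fetch and decode one JSON document from the given URL."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def poll(url, handle, requested_seconds=60, rounds=None,
         fetch=fetch_json, sleep=time.sleep):
    """Call handle(metrics) once per interval.

    rounds=None polls until interrupted; fetch and sleep are injectable
    so the loop can be exercised without a network or real waiting.
    """
    interval = effective_interval(requested_seconds)
    done = 0
    while rounds is None or done < rounds:
        handle(fetch(url))
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)
```

A call such as poll(metrics_url, print), with metrics_url pointing at your own metrics API endpoint, would print one JSON document per interval. Requesting an interval below 30 seconds is silently raised to 30 rather than rejected, which is a design choice of this sketch.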


Metrics can be pulled into CloudWatch from both Vespa Cloud and self-hosted Vespa. The recommended solution is to use an AWS Lambda function, as described in Pulling Vespa metrics to Cloudwatch.


Note: This method currently works for self-hosted Vespa only.

The Vespa team has created a Datadog Agent integration to allow real-time monitoring of Vespa in Datadog. The Datadog Vespa integration is not packaged with the agent, but is included in Datadog's integrations-extras repository. Clone it and follow the steps in the README.


Vespa exposes metrics in a text-based format that can be scraped by Prometheus. For Vespa Cloud, append /prometheus/v1/values to your endpoint URL. For self-hosted Vespa, the URL is http://<container-host>:<port>/prometheus/v1/values, where the port is the same as the one used for searching, e.g. 8080. Metrics for each individual host can also be retrieved at http://host:19092/prometheus/v1/values.
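
As a rough illustration of what a scraper sees, the Python sketch below builds the self-hosted URL described above and reads the text format into (series, value) pairs. The function names are hypothetical, the parser handles only plain sample lines (it skips lines starting with # and ignores timestamps), and a real setup would normally let Prometheus do the scraping.

```python
from urllib.request import urlopen


def values_url(host, port=8080):
    """Build the self-hosted scrape URL; use port 19092 for a single host."""
    return f"http://{host}:{port}/prometheus/v1/values"


def parse_samples(text):
    """Parse Prometheus text-format lines into (series, value) pairs."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and HELP/TYPE comments carry no sample
        if "}" in line:
            # Labels may contain spaces, so split after the closing brace.
            head, _, tail = line.rpartition("}")
            series = head + "}"
        else:
            series, _, tail = line.partition(" ")
        fields = tail.split()
        if fields:
            # fields[0] is the value; an optional timestamp may follow.
            samples.append((series, float(fields[0])))
    return samples


def scrape(host, port=8080, timeout=10):
    """Fetch and parse the current samples from one endpoint."""
    with urlopen(values_url(host, port), timeout=timeout) as response:
        return parse_samples(response.read().decode("utf-8"))
```

For example, scrape("localhost") would read from the container port, while scrape("localhost", 19092) would read the per-host endpoint.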

See the quick-start for a Prometheus / Grafana example.

Pushing metrics to CloudWatch

Note: This method currently works for self-hosted Vespa only.

This is presumably the most convenient way to monitor Vespa in CloudWatch. Steps / requirements:

  1. An IAM user or IAM role that has only the PutMetricData permission (IAM action cloudwatch:PutMetricData).
  2. Store the credentials for the above user or role in a shared credentials file on each Vespa node. If a role is used, provide a mechanism to keep the credentials file updated when keys are rotated.
  3. Configure Vespa to push metrics to CloudWatch - example configuration for the admin section in services.xml:
        <metrics>
          <consumer id="my-cloudwatch">
            <metric-set id="default" />
            <cloudwatch region="us-east-1" namespace="my-vespa-metrics">
              <shared-credentials file="/path/to/credentials-file" />
            </cloudwatch>
          </consumer>
        </metrics>
    This configuration sends the default set of Vespa metrics to the CloudWatch namespace my-vespa-metrics in the us-east-1 region. Refer to the metric list for the contents of the default metric set.
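
Step 1 can be illustrated with an IAM policy document that grants nothing but PutMetricData. This is a sketch under assumptions: cloudwatch:PutMetricData is the IAM action name for that permission, the helper name is hypothetical, and how the policy is attached to the user or role is left out.

```python
import json


def put_metric_data_policy():
    """Return a policy document that allows only cloudwatch:PutMetricData."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "cloudwatch:PutMetricData",
                # PutMetricData is not scoped to individual resources,
                # so the statement applies to all of them.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the document so it can be pasted into the IAM console or a file.
    print(json.dumps(put_metric_data_policy(), indent=2))
```

Keeping the policy to a single action matches the requirement that the user or role has no other permissions.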

Monitoring with Grafana

Follow these steps to set up monitoring with Grafana for a Vespa instance. This guide builds on the quick start by adding three more Docker containers and connecting these in the Docker monitoring network:

[Figure: Docker containers in a Docker network]
  1. Run the Quick Start:

    Complete steps 1-6 (or 1-9), but skip the removal in step 10.

  2. Create a network and add the vespa container to it:

    $ cd sample-apps/album-recommendation-monitoring
    $ docker network create --driver bridge monitoring
    $ docker network connect monitoring vespa

    This creates the monitoring network and attaches the vespa container to it. Find details in docker-compose.yml.

  3. Launch Prometheus and Grafana:

    $ docker-compose up --detach prometheus
    $ docker-compose up --detach grafana

    This launches Prometheus and Grafana with the configurations defined in docker-compose.yml. Prometheus is a time-series database that stores values together with their timestamps. Grafana is a visualisation tool that can be used to build graphs of important Vespa metrics. The configuration for both includes some default values and connects the two.

  4. Check that Grafana and Prometheus are running:

    Open http://localhost:3000/ and find the Grafana login screen - log in with admin/admin (skip changing password). From the list on the left, click Manage under Dashboards (the symbol with 4 blocks), then click the Vespa Detailed Monitoring Dashboard. The dashboard displays detailed Vespa metrics, but is empty for now.

    Now open Prometheus at http://localhost:9090/. To see what data Prometheus has, use the input box, which auto-completes: for example, enter feed_operations_rate and click Execute. Also explore the Status dropdown.

  5. Start the Random Data Feeder:

    $ docker-compose up --detach random-data-feeder

    This builds and starts a Random Data Feeder, which generates random sets of data and feeds them into the Vespa instance. It also runs queries repeatedly, so that Grafana has something to visualise. Compiling the Random Data Feeder takes a few minutes.

  6. Check the updated Grafana metrics:

    Graphs will now show up in Grafana and Prometheus - it might take a minute or two. The Grafana dashboard is fully customisable. Change the default modes of Grafana and Prometheus by editing the configuration files in album-recommendation-monitoring.

  7. Remove containers and network:

    $ docker rm -f vespa; docker-compose down      # to remove images as well, add --rmi all
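
The checks in steps 4 and 6 can also be scripted against the Prometheus HTTP API instead of the web UI. The sketch below is an illustration under assumptions rather than part of the sample application: it uses the Prometheus instant-query endpoint /api/v1/query on the localhost:9090 address from step 4, and the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(expression, base="http://localhost:9090"):
    """Build an instant-query URL for the given PromQL expression."""
    return f"{base}/api/v1/query?{urlencode({'query': expression})}"


def extract_values(payload):
    """Return (labels, value) pairs from a successful vector result."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"unexpected result type: {data['resultType']}")
    # Each item carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def instant_query(expression, base="http://localhost:9090", timeout=10):
    """Run one instant query and return its (labels, value) pairs."""
    with urlopen(query_url(expression, base), timeout=timeout) as response:
        return extract_values(json.load(response))
```

Calling instant_query("feed_operations_rate") while the containers are still running (that is, before step 7) returns an empty list until the feeder from step 5 has produced data.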