Monitoring

Vespa provides metrics integration with CloudWatch, Datadog and Prometheus / Grafana, as well as a JSON HTTP API. To get started quickly, see the monitoring with Grafana quick start.

There are two main approaches to transferring metrics to an external system:

  • Have the external system pull metrics from Vespa
  • Make Vespa push metrics to the external system

Pulling metrics from Vespa

All pull-based solutions use Vespa's metrics API, which provides metrics in JSON format, either for the full system or for a single node.

CloudWatch

Metrics can be pulled into CloudWatch from both Vespa Cloud and self-hosted Vespa. The recommended solution is to use an AWS Lambda function, as described in Pulling Vespa metrics to CloudWatch.

Datadog

Note: This method currently works for self-hosted Vespa only.

The Vespa team has created a Datadog Agent integration to allow real-time monitoring of Vespa in Datadog. The Datadog Vespa integration is not packaged with the agent, but is included in Datadog's integrations-extras repository. Clone it and follow the steps in the README.

Prometheus

Vespa exposes metrics in a text-based format that can be scraped by Prometheus. For Vespa Cloud, append /prometheus/v1/values to your endpoint URL. For self-hosted Vespa, the URL is http://<container-host>:<port>/prometheus/v1/values, where the port is the same as for searching, e.g. 8080. Metrics for each individual host can also be retrieved at http://<host>:19092/prometheus/v1/values.
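As a rough illustration of the URL patterns above, the following Python sketch builds the three scrape URLs and fetches the raw text from one of them. The function names and the example host names are hypothetical, and the fetch helper assumes the endpoint is reachable without extra authentication; any certificates or tokens your endpoint requires are not handled here.

```python
import urllib.request

# Path taken from the text above; everything else here is illustrative.
PROMETHEUS_PATH = "/prometheus/v1/values"


def cloud_scrape_url(endpoint_url: str) -> str:
    """Append the Prometheus path to a Vespa Cloud endpoint URL."""
    return endpoint_url.rstrip("/") + PROMETHEUS_PATH


def self_hosted_scrape_url(container_host: str, port: int = 8080) -> str:
    """Build the scrape URL for a self-hosted container (same port as search)."""
    return f"http://{container_host}:{port}{PROMETHEUS_PATH}"


def node_scrape_url(host: str) -> str:
    """Build the per-host scrape URL on port 19092."""
    return f"http://{host}:19092{PROMETHEUS_PATH}"


def fetch_metrics_text(url: str, timeout: float = 10.0) -> str:
    """Fetch the text-based metrics payload from a scrape URL.

    Assumes plain HTTP(S) access with no client authentication.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # "localhost" is a placeholder; replace it with a real container host.
    url = self_hosted_scrape_url("localhost")
    print(url)
    # print(fetch_metrics_text(url))  # uncomment when a Vespa node is reachable
```

In a real setup Prometheus itself performs the scrape; a helper like this is mainly useful for checking that the endpoint responds before adding it as a scrape target.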

See the quick-start for a Prometheus / Grafana example.

Pushing metrics to CloudWatch

Note: This method currently works for self-hosted Vespa only.

This is likely the most convenient way to monitor Vespa in CloudWatch. Steps / requirements:

  1. Create an IAM user or IAM role that has only the cloudwatch:PutMetricData permission.
  2. Store the credentials for the above user or role in a shared credentials file on each Vespa node. If a role is used, provide a mechanism to keep the credentials file updated when keys are rotated.
  3. Configure Vespa to push metrics to CloudWatch. Example configuration for the admin section in services.xml:
    <metrics>
        <consumer id="my-cloudwatch">
            <metric-set id="default" />
            <cloudwatch region="us-east-1" namespace="my-vespa-metrics">
                <shared-credentials file="/path/to/credentials-file" />
            </cloudwatch>
        </consumer>
    </metrics>
    
    This configuration sends the default set of Vespa metrics to the CloudWatch namespace my-vespa-metrics in the us-east-1 region. Refer to the metric list for the contents of the default metric set.
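Steps 1 and 2 above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the required procedure: the function names are hypothetical, the policy is one plausible way to grant only cloudwatch:PutMetricData, and the credentials file is assumed to use the standard AWS shared credentials layout with a profile named default.

```python
import configparser
import io
import json
from typing import Optional


def put_metric_data_policy() -> str:
    """Return an IAM policy document that allows only cloudwatch:PutMetricData."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "cloudwatch:PutMetricData",
                # PutMetricData does not support resource-level permissions.
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


def credentials_file_text(
    access_key_id: str,
    secret_access_key: str,
    session_token: Optional[str] = None,
    profile: str = "default",  # assumption: the default profile is the one read
) -> str:
    """Render the contents of an AWS shared credentials file."""
    parser = configparser.ConfigParser(interpolation=None)
    parser[profile] = {
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
    }
    if session_token is not None:
        # Temporary role credentials also carry a session token.
        parser[profile]["aws_session_token"] = session_token
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(put_metric_data_policy())
    # Placeholder values only; never hard-code real keys.
    print(credentials_file_text("AKIDEXAMPLE", "secretEXAMPLE"))
```

If a role is used, a scheduled job could call a helper like credentials_file_text with fresh temporary credentials and rewrite the file at the path referenced by shared-credentials whenever keys are rotated.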