This guide goes through the following aspects of node resource configuration:
In Vespa Cloud, a node's resources are configured like:
<nodes count="8">
<resources vcpu="4" memory="16Gb" disk="300Gb"/>
</nodes>
With this, you specify the dimensions independently. E.g., one can double the CPU, keeping all other dimensions constant.
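As a minimal sketch of such an independent change, doubling the CPU of the configuration shown above while keeping memory and disk constant could look like this. The value 8 is simply twice the original vcpu, used for illustration, not a recommendation:

```xml
<nodes count="8">
    <!-- vcpu doubled from 4 to 8; memory and disk are unchanged -->
    <resources vcpu="8" memory="16Gb" disk="300Gb"/>
</nodes>
```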
This is important when tuning for the optimal price/performance point, as the pieces of an application have different sweet spots. For example, the product search cluster of an application can be more CPU bound than product recommendations; the latter might need relatively more memory.
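To illustrate, the two clusters in this example might be sized differently, as in the sketch below. The node counts and resource values are hypothetical, chosen only to show one cluster weighted toward CPU and the other toward memory. Each nodes element is assumed to sit inside its own cluster's definition in the application package:

```xml
<!-- Hypothetical product search cluster: relatively more CPU per node -->
<nodes count="8">
    <resources vcpu="8" memory="16Gb" disk="300Gb"/>
</nodes>

<!-- Hypothetical product recommendation cluster: relatively more memory per node -->
<nodes count="4">
    <resources vcpu="4" memory="32Gb" disk="300Gb"/>
</nodes>
```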
Optimizing for cost/performance is therefore easy. Simplified, applications can be CPU, disk, or memory bound. A general rule of thumb is to be bound by the most expensive component, often CPU. Refer to the node resource reference for all dimensions.
Applications change over time:
Finding the optimal node configuration is an iterative process. It is simplified by using the Resource Suggestions view in the Vespa Console:
Vespa Cloud tracks usage over time and suggests node configuration and topology changes based on last week's load. In the example above, observe a suggestion that doubles the memory relative to CPU.
This simplifies what to configure, and one can roll out isolated changes while observing latency and other business metrics like relevance quality.
Resource configuration is part of the application package. To change a cluster's resources, deploy the new version of the application package to Vespa Cloud and wait for the changes to apply:
Changing count will modify the existing cluster.
Changing the resources configuration will set up a parallel cluster and migrate data to it. This is generally slower than changing the node count, as more data moves.

Making changes to the resource specifications is hence fully automated. The quickest way to the sweet spot is to initially deploy with enough capacity and do daily re-tuning to cut cost.
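The two kinds of change described above can be sketched as follows, starting from the configuration at the top of this guide. The new values (10 nodes, 32Gb) are illustrative assumptions, and the three fragments are alternatives rather than parts of one file:

```xml
<!-- Starting point, as configured earlier in this guide -->
<nodes count="8">
    <resources vcpu="4" memory="16Gb" disk="300Gb"/>
</nodes>

<!-- Change 1: only count differs (8 to 10);
     the existing cluster is modified -->
<nodes count="10">
    <resources vcpu="4" memory="16Gb" disk="300Gb"/>
</nodes>

<!-- Change 2: resources differ (memory 16Gb to 32Gb);
     a parallel cluster is set up and data is migrated to it -->
<nodes count="8">
    <resources vcpu="4" memory="32Gb" disk="300Gb"/>
</nodes>
```

In both cases the change is rolled out by deploying the updated application package. The second kind typically takes longer to apply, as more data moves.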
Vespa Cloud provides performance dashboards with the relevant metrics in this phase:
Eventually, the application has its optimal price/performance characteristics, without lengthy benchmarking activities.
Resource configurations map to the cloud provider's real resources, like AWS EC2 compute instances. The instance inventory develops over time, like:
Both have 16 vCPU and 128G RAM, but r8g_4xlarge is of a newer generation and presumably has higher performance: "R8g instances deliver around 30% higher performance over R7g instances, …"
Because the resources configuration is general and independent of instance types, Vespa Cloud will automatically migrate load to more cost-effective compute instances over time.
This means that Vespa Cloud applications migrate to more recent instance types of the same configuration with no manual intervention. This keeps the total cost in check, and performance tracks advances in hardware.