# Profiling

Guidelines when profiling:

• Define clearly what to profile.
• Find a load that represents what to profile. This is often the hardest part, because stressing other components adds a lot of noise.
• Make sure that no other bottleneck prevents the load from stressing the profiled component. It makes little sense to do CPU profiling if the network is the limitation.
• If possible, write dedicated benchmark programs, similar to unit tests, that stress exactly what to profile.
• If the system is multithreaded:
  • Always profile single-threaded first. That gives a baseline for the scaling tests. Verify that as many cores as expected are utilized.
  • Increase scaling gradually to at least 2x numcores or until throughput degrades.
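
The scaling guideline above can be sketched as a small harness. This is a minimal illustration, not part of Vespa: `scaling_sweep`, `speedups`, the `measure` callback and the `tolerance` threshold are all hypothetical names and choices. `measure(workers)` is assumed to run your own benchmark with the given number of threads and return its throughput.

```python
from typing import Callable, List, Tuple


def scaling_sweep(
    measure: Callable[[int], float],
    num_cores: int,
    tolerance: float = 0.95,
) -> List[Tuple[int, float]]:
    """Measure throughput for 1, 2, ... workers, up to 2 x num_cores.

    Stops early once throughput falls below tolerance * the best value
    seen so far. Returns (workers, throughput) pairs; the first entry is
    the single-threaded baseline.
    """
    results: List[Tuple[int, float]] = []
    best = 0.0
    for workers in range(1, 2 * num_cores + 1):
        throughput = measure(workers)
        results.append((workers, throughput))
        if throughput < tolerance * best:
            break  # throughput degraded, stop the sweep here
        best = max(best, throughput)
    return results


def speedups(results: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Express each step as a factor of the single-threaded baseline."""
    baseline = results[0][1]
    return [(workers, throughput / baseline) for workers, throughput in results]


if __name__ == "__main__":
    # Synthetic stand-in for a real benchmark (an assumption for the demo):
    # throughput scales up to 4 workers, then contention reduces it.
    def fake_measure(workers: int) -> float:
        return 100.0 * workers if workers <= 4 else 250.0

    for workers, factor in speedups(scaling_sweep(fake_measure, num_cores=4)):
        print(f"{workers} workers: {factor:.2f}x baseline")
```

Comparing the speedup factors with the worker count shows where scaling stops being linear, which is usually the point worth profiling more closely.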

Also see using valgrind with Vespa.

## vmstat

vmstat can be used to figure out what kind of resources are used:

• cpu usage split in user, system, idle, and io wait: system should be low (<10)
• swap in/out: should be zero

Note: A maxed out system should have either maxed out disks or cpu (idle == 0). If not, there might be locks.

Example:

    $ vmstat 1
    procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
     r  b  swpd     free    buff     cache  si  so   bi    bo  in    cs  us  sy   id  wa
     0  0  5628  3315460  304024  23008616   0   0   14    34   0     0   0   0   99   0
     1  0  5628  3298884  304024  23008640   0   0    0   396  33  4615   9   1   90   0
     0  0  5628  3316336  304028  23008644   0   0    0     0  15  4469   4   1   95   0
     0  0  5628  3316592  304028  23008644   0   0    0     0  24  4364   0   0  100   0
     0  0  5628  3316592  304028  23008644   0   0    0  2948  20  4305   0   0  100   0
     0  0  5628  3316468  304028  23008644   0   0    0     0  22  4259   0   0  100   0
     0  0  5628  3316468  304028  23008644   0   0    0   180  20  4279   0   0  100   0
     0  0  5628  3316468  304028  23008644   0   0    0     0  26  4349   0   0  100   0
    16  0  5628  3284236  304056  23008688   0   0   12   188  17  9196  38   2   60   0
    19  0  5628  3267020  304056  23008732   0   0    8   128  44  6408  99   1    0   0
    16  0  5628  3245472  304060  23008840   0   0   20     0   9  7191  99   1    0   0
    17  0  5628  3227784  304060  23008872   0   0   20     0  27  6420  99   1    0   0

Use top to see which applications consume cpu and memory.

## CPU Profiling using perf

Sometimes, when debugging cpu usage in a remote cluster and debugging performance, it might be beneficial to get a performance profile snapshot. To use perf, install vespa-debuginfo-<vespa-version> matching the Vespa version, example with 7.147.12:

$ sudo yum install vespa-debuginfo-7.147.12

Record:
$ sudo perf record --pid=

The pid of the vespa-proton-bin process can be obtained using vespa-sentinel-cmd, or top/ps.

To get a performance profile report:

$ sudo perf report
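
When the recording is scripted, it can help to bound its duration. The sketch below only builds the command line. The helper name `perf_record_command` is hypothetical, and appending `-- sleep N` so that perf stops after N seconds is a common perf idiom assumed here, not something the steps above prescribe.

```python
from typing import List


def perf_record_command(pid: int, seconds: int) -> List[str]:
    """Build a perf record command that samples one process for a fixed time."""
    if pid <= 0 or seconds <= 0:
        raise ValueError("pid and seconds must be positive")
    # "-- sleep N" makes perf stop recording when the sleep command exits,
    # instead of running until it is interrupted.
    return ["sudo", "perf", "record", f"--pid={pid}", "--", "sleep", str(seconds)]


if __name__ == "__main__":
    # 1234 is a placeholder pid; use the pid of the process to profile.
    print(" ".join(perf_record_command(1234, 30)))
```

The resulting list can be passed to `subprocess.run(..., check=True)` on the host being profiled.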

Sometimes it's useful to have kernel debug info installed to get symbol info for the Linux kernel:
$ sudo yum install kernel-debuginfo

It's important that the kernel-debuginfo version matches the version of the installed kernel package as closely as possible.
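
The vmstat rules of thumb given earlier on this page (system time below 10, no swap in/out, idle == 0 on a maxed out cpu) can also be checked automatically. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the parser assumes the 16-column layout shown in the vmstat example. Other vmstat versions may print additional columns, which this sketch would skip.

```python
from typing import Dict, List

# Column order of the 16-column vmstat layout (an assumption of this sketch).
COLUMNS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
           "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]


def parse_vmstat(text: str) -> List[Dict[str, int]]:
    """Turn vmstat text output into one dict per sample, skipping header lines."""
    samples = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly 16 fields, all of them integers.
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            samples.append(dict(zip(COLUMNS, (int(f) for f in fields))))
    return samples


def warnings_for(sample: Dict[str, int]) -> List[str]:
    """Apply the rules of thumb: system time below 10 and no swap in/out."""
    found = []
    if sample["sy"] >= 10:
        found.append("system cpu is not low (sy >= 10)")
    if sample["si"] != 0 or sample["so"] != 0:
        found.append("swapping (si/so not zero)")
    return found


def cpu_maxed_out(sample: Dict[str, int]) -> bool:
    """True when the sample shows no idle cpu (id == 0)."""
    return sample["id"] == 0


# Two rows taken from the vmstat example on this page.
EXAMPLE = """\
 r  b  swpd     free    buff     cache  si  so   bi    bo  in    cs  us  sy   id  wa
 0  0  5628  3315460  304024  23008616   0   0   14    34   0     0   0   0   99   0
19  0  5628  3267020  304056  23008732   0   0    8   128  44  6408  99   1    0   0
"""

if __name__ == "__main__":
    for sample in parse_vmstat(EXAMPLE):
        print(sample["us"], sample["sy"], sample["id"],
              warnings_for(sample), cpu_maxed_out(sample))
```

If neither the cpu nor the disks are maxed out while throughput has stopped increasing, that matches the case where the note on vmstat suggests looking for locks.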