vds.idealstate.buckets_rechecking |
The number of buckets that we are rechecking for ideal state operations |
bucket |
vds.idealstate.idealstate_diff |
A number representing the current difference from the ideal state. It decreases steadily as the system gets closer to the ideal state |
bucket |
vds.idealstate.buckets_toofewcopies |
The number of buckets the distributor controls that have fewer copies than the desired redundancy |
bucket |
vds.idealstate.buckets_toomanycopies |
The number of buckets the distributor controls that have more copies than the desired redundancy |
bucket |
vds.idealstate.buckets |
The number of buckets the distributor controls |
bucket |
vds.idealstate.buckets_notrusted |
The number of buckets that have no trusted copies. |
bucket |
vds.idealstate.bucket_replicas_moving_out |
Bucket replicas that should be moved out, e.g. because a node is being retired or a node with higher ideal state priority has been added to the cluster |
bucket |
vds.idealstate.bucket_replicas_copying_out |
Bucket replicas that should be copied out, e.g. node is in ideal state but might have to provide data to other nodes in a merge |
bucket |
vds.idealstate.bucket_replicas_copying_in |
Bucket replicas that should be copied in, e.g. node does not have a replica for a bucket that it is in ideal state for |
bucket |
vds.idealstate.bucket_replicas_syncing |
Bucket replicas that need syncing due to mismatching metadata |
bucket |
vds.idealstate.max_observed_time_since_last_gc_sec |
Maximum time (in seconds) since GC was last successfully run for a bucket. Aggregated max value across all buckets on the distributor. |
second |
vds.idealstate.delete_bucket.done_ok |
The number of operations successfully performed |
operation |
vds.idealstate.delete_bucket.done_failed |
The number of operations that failed |
operation |
vds.idealstate.delete_bucket.pending |
The number of operations pending |
operation |
vds.idealstate.delete_bucket.blocked |
The number of operations blocked by blocking operation starter |
operation |
vds.idealstate.delete_bucket.throttled |
The number of operations throttled by throttling operation starter |
operation |
vds.idealstate.merge_bucket.done_ok |
The number of operations successfully performed |
operation |
vds.idealstate.merge_bucket.done_failed |
The number of operations that failed |
operation |
vds.idealstate.merge_bucket.pending |
The number of operations pending |
operation |
vds.idealstate.merge_bucket.blocked |
The number of operations blocked by blocking operation starter |
operation |
vds.idealstate.merge_bucket.throttled |
The number of operations throttled by throttling operation starter |
operation |
vds.idealstate.merge_bucket.source_only_copy_changed |
The number of merge operations where source-only copy changed |
operation |
vds.idealstate.merge_bucket.source_only_copy_delete_blocked |
The number of merge operations where delete of unchanged source-only copies was blocked |
operation |
vds.idealstate.merge_bucket.source_only_copy_delete_failed |
The number of merge operations where delete of unchanged source-only copies failed |
operation |
vds.idealstate.split_bucket.done_ok |
The number of operations successfully performed |
operation |
vds.idealstate.split_bucket.done_failed |
The number of operations that failed |
operation |
vds.idealstate.split_bucket.pending |
The number of operations pending |
operation |
vds.idealstate.split_bucket.blocked |
The number of operations blocked by blocking operation starter |
operation |
vds.idealstate.split_bucket.throttled |
The number of operations throttled by throttling operation starter |
operation |
vds.idealstate.join_bucket.done_ok |
The number of operations successfully performed |
operation |
vds.idealstate.join_bucket.done_failed |
The number of operations that failed |
operation |
vds.idealstate.join_bucket.pending |
The number of operations pending |
operation |
vds.idealstate.join_bucket.blocked |
The number of operations blocked by blocking operation starter |
operation |
vds.idealstate.join_bucket.throttled |
The number of operations throttled by throttling operation starter |
operation |
vds.idealstate.garbage_collection.done_ok |
The number of operations successfully performed |
operation |
vds.idealstate.garbage_collection.done_failed |
The number of operations that failed |
operation |
vds.idealstate.garbage_collection.pending |
The number of operations pending |
operation |
vds.idealstate.garbage_collection.documents_removed |
Number of documents removed by GC operations |
document |
vds.idealstate.garbage_collection.blocked |
The number of operations blocked by blocking operation starter |
operation |
vds.idealstate.garbage_collection.throttled |
The number of operations throttled by throttling operation starter |
operation |
vds.distributor.puts.latency |
The latency of put operations |
millisecond |
vds.distributor.puts.ok |
The number of successful put operations performed |
operation |
vds.distributor.puts.failures.total |
Sum of all failures |
operation |
vds.distributor.puts.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.puts.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.puts.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.puts.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.puts.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.puts.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.puts.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.puts.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.puts.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.puts.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.puts.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.removes.latency |
The latency of remove operations |
millisecond |
vds.distributor.removes.ok |
The number of successful remove operations performed |
operation |
vds.distributor.removes.failures.total |
Sum of all failures |
operation |
vds.distributor.removes.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.removes.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.removes.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.removes.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.removes.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.removes.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.removes.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.removes.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.removes.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.removes.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.removes.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.updates.latency |
The latency of update operations |
millisecond |
vds.distributor.updates.ok |
The number of successful update operations performed |
operation |
vds.distributor.updates.failures.total |
Sum of all failures |
operation |
vds.distributor.updates.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.updates.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.updates.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.updates.diverging_timestamp_updates |
Number of updates that report they were performed against divergent version timestamps on different replicas |
operation |
vds.distributor.updates.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.updates.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.updates.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.updates.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.updates.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.updates.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.updates.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.updates.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.updates.fast_path_restarts |
Number of safe path (write repair) updates that were restarted as fast path updates because all replicas returned documents with the same timestamp in the initial read phase |
operation |
vds.distributor.removelocations.ok |
The number of successful removelocations operations performed |
operation |
vds.distributor.removelocations.failures.total |
Sum of all failures |
operation |
vds.distributor.removelocations.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.removelocations.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.removelocations.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.removelocations.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.removelocations.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.removelocations.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.removelocations.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.removelocations.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.removelocations.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.removelocations.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.removelocations.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.removelocations.latency |
The average latency of removelocations operations |
millisecond |
vds.distributor.gets.latency |
The average latency of get operations |
millisecond |
vds.distributor.gets.ok |
The number of successful get operations performed |
operation |
vds.distributor.gets.failures.total |
Sum of all failures |
operation |
vds.distributor.gets.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.gets.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.gets.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.gets.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.gets.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.gets.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.gets.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.gets.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.gets.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.gets.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.gets.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.visitor.latency |
The average latency of visitor operations |
millisecond |
vds.distributor.visitor.ok |
The number of successful visitor operations performed |
operation |
vds.distributor.visitor.failures.total |
Sum of all failures |
operation |
vds.distributor.visitor.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.visitor.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.visitor.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.visitor.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.visitor.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.visitor.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.visitor.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.visitor.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.visitor.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.visitor.bytes_per_visitor |
The number of bytes visited on content nodes as part of a single client visitor command |
operation |
vds.distributor.visitor.docs_per_visitor |
The number of documents visited on content nodes as part of a single client visitor command |
operation |
vds.distributor.visitor.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.visitor.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.docsstored |
Number of documents stored in all buckets controlled by this distributor |
document |
vds.distributor.bytesstored |
Number of bytes stored in all buckets controlled by this distributor |
byte |
metricmanager.periodichooklatency |
Time in ms used to update a single periodic hook |
millisecond |
metricmanager.resetlatency |
Time in ms used to reset all metrics. |
millisecond |
metricmanager.sleeptime |
Time in ms worker thread is sleeping |
millisecond |
metricmanager.snapshothooklatency |
Time in ms used to update a single snapshot hook |
millisecond |
metricmanager.snapshotlatency |
Time in ms used to take a snapshot |
millisecond |
vds.distributor.activate_cluster_state_processing_time |
Elapsed time during which the distributor thread is blocked on merging pending bucket info into its bucket database upon activating a cluster state |
millisecond |
vds.distributor.bucket_db.memory_usage.allocated_bytes |
The number of allocated bytes |
byte |
vds.distributor.bucket_db.memory_usage.dead_bytes |
The number of dead bytes (<= used_bytes) |
byte |
vds.distributor.bucket_db.memory_usage.onhold_bytes |
The number of bytes on hold |
byte |
vds.distributor.bucket_db.memory_usage.used_bytes |
The number of used bytes (<= allocated_bytes) |
byte |
vds.distributor.getbucketlists.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.getbucketlists.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.getbucketlists.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.getbucketlists.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.getbucketlists.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.getbucketlists.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.getbucketlists.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.getbucketlists.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.getbucketlists.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.getbucketlists.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.getbucketlists.failures.total |
Total number of failures |
operation |
vds.distributor.getbucketlists.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.getbucketlists.latency |
The average latency of getbucketlists operations |
millisecond |
vds.distributor.getbucketlists.ok |
The number of successful getbucketlists operations performed |
operation |
vds.distributor.recoverymodeschedulingtime |
Time spent scheduling operations in recovery mode after receiving new cluster state |
millisecond |
vds.distributor.set_cluster_state_processing_time |
Elapsed time during which the distributor thread is blocked on processing its bucket database upon receiving a new cluster state |
millisecond |
vds.distributor.state_transition_time |
Time it takes to complete a cluster state transition. If a state transition is preempted before completing, its elapsed time is counted as part of the total time spent for the final, completed state transition |
millisecond |
vds.distributor.stats.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.stats.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.stats.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.stats.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.stats.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.stats.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.stats.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.stats.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.stats.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.stats.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.stats.failures.total |
The total number of failures |
operation |
vds.distributor.stats.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.stats.latency |
The average latency of stats operations |
millisecond |
vds.distributor.stats.ok |
The number of successful stats operations performed |
operation |
vds.distributor.update_gets.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.update_gets.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.update_gets.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.update_gets.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.update_gets.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.update_gets.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.update_gets.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.update_gets.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.update_gets.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.update_gets.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.update_gets.failures.total |
The total number of failures |
operation |
vds.distributor.update_gets.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.update_gets.latency |
The average latency of update_gets operations |
millisecond |
vds.distributor.update_gets.ok |
The number of successful update_gets operations performed |
operation |
vds.distributor.update_metadata_gets.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.update_metadata_gets.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.update_metadata_gets.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.update_metadata_gets.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.update_metadata_gets.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.update_metadata_gets.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.update_metadata_gets.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.update_metadata_gets.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.update_metadata_gets.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.update_metadata_gets.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.update_metadata_gets.failures.total |
The total number of failures |
operation |
vds.distributor.update_metadata_gets.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.update_metadata_gets.latency |
The average latency of update_metadata_gets operations |
millisecond |
vds.distributor.update_metadata_gets.ok |
The number of successful update_metadata_gets operations performed |
operation |
vds.distributor.update_puts.failures.busy |
The number of messages from storage that failed because the storage node was busy |
operation |
vds.distributor.update_puts.failures.concurrent_mutations |
The number of operations that failed transiently because a mutating operation was already in progress for the same document ID |
operation |
vds.distributor.update_puts.failures.inconsistent_bucket |
The number of operations that failed due to buckets being in an inconsistent state or not found |
operation |
vds.distributor.update_puts.failures.notconnected |
The number of operations discarded because there were no available storage nodes to send to |
operation |
vds.distributor.update_puts.failures.notfound |
The number of operations that failed because the document did not exist |
operation |
vds.distributor.update_puts.failures.notready |
The number of operations discarded because distributor was not ready |
operation |
vds.distributor.update_puts.failures.safe_time_not_reached |
The number of operations that failed transiently because they arrived before the safe time point for bucket ownership handovers had passed |
operation |
vds.distributor.update_puts.failures.storagefailure |
The number of operations that failed in storage |
operation |
vds.distributor.update_puts.failures.test_and_set_failed |
The number of mutating operations that failed because they specified a test-and-set condition that did not match the existing document |
operation |
vds.distributor.update_puts.failures.timeout |
The number of operations that failed because the operation timed out towards storage |
operation |
vds.distributor.update_puts.failures.total |
The total number of put failures |
operation |
vds.distributor.update_puts.failures.wrongdistributor |
The number of operations discarded because they were sent to the wrong distributor |
operation |
vds.distributor.update_puts.latency |
The average latency of update_puts operations |
millisecond |
vds.distributor.update_puts.ok |
The number of successful update_puts operations performed |
operation |
vds.idealstate.nodes_per_merge |
The number of nodes involved in a single merge operation. |
node |
vds.idealstate.set_bucket_state.blocked |
The number of operations blocked by blocking operation starter |
operation |
vds.idealstate.set_bucket_state.done_failed |
The number of operations that failed |
operation |
vds.idealstate.set_bucket_state.done_ok |
The number of operations successfully performed |
operation |
vds.idealstate.set_bucket_state.pending |
The number of operations pending |
operation |
vds.idealstate.set_bucket_state.throttled |
The number of operations throttled by throttling operation starter |
operation |
vds.bouncer.clock_skew_aborts |
Number of client operations that were aborted due to clock skew between sender and receiver exceeding the acceptable range |
operation |
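
To illustrate how some of the counters listed above relate to each other, the following minimal sketch derives a failure ratio from an operation's `ok` and `failures.total` counters and checks the ordering `dead_bytes <= used_bytes <= allocated_bytes` stated for the bucket database memory metrics. The `snapshot` dictionary, its values, and the helper names `failure_ratio` and `memory_usage_is_consistent` are hypothetical, and the sketch assumes that `ok` and `failures.total` count disjoint sets of operations. How metric values are actually retrieved is outside its scope.

```python
from typing import Mapping


def failure_ratio(snapshot: Mapping[str, float], operation: str) -> float:
    """Return failures / (ok + failures) for one distributor operation type.

    Assumption: `ok` and `failures.total` count disjoint sets of operations.
    Missing counters are treated as zero.
    """
    prefix = f"vds.distributor.{operation}"
    ok = snapshot.get(f"{prefix}.ok", 0.0)
    failed = snapshot.get(f"{prefix}.failures.total", 0.0)
    total = ok + failed
    return failed / total if total > 0 else 0.0


def memory_usage_is_consistent(snapshot: Mapping[str, float]) -> bool:
    """Check the documented ordering dead_bytes <= used_bytes <= allocated_bytes."""
    prefix = "vds.distributor.bucket_db.memory_usage"
    dead = snapshot[f"{prefix}.dead_bytes"]
    used = snapshot[f"{prefix}.used_bytes"]
    allocated = snapshot[f"{prefix}.allocated_bytes"]
    return 0 <= dead <= used <= allocated


# Hypothetical metric values, made up purely for illustration.
snapshot = {
    "vds.distributor.puts.ok": 980,
    "vds.distributor.puts.failures.total": 20,
    "vds.distributor.bucket_db.memory_usage.dead_bytes": 1_024,
    "vds.distributor.bucket_db.memory_usage.used_bytes": 65_536,
    "vds.distributor.bucket_db.memory_usage.allocated_bytes": 131_072,
}

print(f"put failure ratio: {failure_ratio(snapshot, 'puts'):.2%}")
print(f"memory usage consistent: {memory_usage_is_consistent(snapshot)}")
```

The same `failure_ratio` helper applies to any operation family that exposes both an `ok` and a `failures.total` counter, such as `removes`, `updates`, `gets`, or `visitor`.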