Statistics for Logical Usage
Swarm calculates storage use by addressable objects to support conventional (storage filer) reporting of file counts and space usage. This approach tracks cluster-wide capacity by counting logical objects (the unique content of uploaded and versioned files) rather than actual streams (the raw space consumed by all Swarm components, including replicas, EC segments, context objects, and manifests). Versions add to the object totals: a cluster holding one 20 MB image with 3 replicas and 4 prior versions consumes 5 logical objects and ~100 MB of logical space.
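The arithmetic in the example above can be sketched as follows. This is only an illustration, not Swarm code: the function name is hypothetical, and it assumes every version has the same size and the same replica count. The raw figure covers replicas only and ignores manifests, context objects, and other overhead.

```python
def logical_usage(size_mb, versions, replicas):
    """Return (logical_objects, logical_space_mb, raw_replica_space_mb).

    Assumption: every version has the same size and replica count.
    """
    logical_objects = versions               # each version counts exactly once
    logical_space_mb = versions * size_mb    # replicas are not counted
    raw_replica_space_mb = logical_space_mb * replicas  # replicas only, no other overhead
    return logical_objects, logical_space_mb, raw_replica_space_mb


# One current version plus 4 prior versions of a 20 MB image, 3 replicas each.
objects, logical_mb, raw_mb = logical_usage(size_mb=20, versions=1 + 4, replicas=3)
print(objects, logical_mb, raw_mb)  # 5 100 300
```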
Swarm continuously sends the nodes updates about the cluster's logical usage (the current number of objects and the space they consume), and the nodes adjust these values with their local space-affecting activity. Swarm aggregates these updates (for accuracy) and publishes them using SNMP and REST as logicalObjects and logicalSpace. A third statistic, logicalUnprocessed, exists to provide insight into the other statistics' accuracy (the closer to zero, the more accurate). Swarm propagates this data quickly, so there is little lag behind the cluster activity affecting usage: writes, deletes, and updates. After a disk failure, the aggregated estimates drop, then rise back to the true value once Volume Recovery recreates the streams that were lost with the disk.
Tip
When first booting the cluster after installing or upgrading to version 9.0, Swarm starts traversing the volumes to build these statistics, so they are not accurate until that completes; the value of logicalUnprocessed indicates the progression. Expect it to take 1 complete HP cycle to drop logicalUnprocessed to 0.
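One way to turn logicalUnprocessed into a rough progress figure is to compare it with the total number of streams in the cluster. The sketch below is hypothetical: the function name is made up, and it assumes the total stream count is obtained separately.

```python
def traversal_progress(logical_unprocessed, total_streams):
    """Estimate the fraction of streams already accounted for (0.0 to 1.0).

    Assumption: total_streams is a stream count obtained elsewhere.
    """
    if total_streams <= 0:
        raise ValueError("total_streams must be positive")
    fraction = 1.0 - (logical_unprocessed / total_streams)
    # Clamp, because both inputs are estimates and may briefly disagree.
    return max(0.0, min(1.0, fraction))


print(traversal_progress(250, 1000))  # 0.75
```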
Usage via SNMP and REST
Swarm aggregates usage statistics from each volume and publishes them as cluster-wide values:
Aggregates | Units | Description | Accuracy |
---|---|---|---|
logicalObjects | count | The number of unique objects (including historical versions) stored for the entire cluster. Each content object counts only as 1, regardless of the number of replicas or EC segments that comprise it. | Approaches the actual number of logical objects in the cluster, minus context (domain, bucket) objects. Note: Logical counts are estimates, and they are not accurate during volume recovery. The estimating is a consequence of Swarm's robust, no-single-point-of-failure design: Swarm keeps no master list of objects, so counts are inferred from multiple overlapping sources of information. |
logicalSpace | MB | The logical space stored for the entire cluster, including historical versions (which are separate objects). | Includes both the data and the persisted headers on each object, with header newlines counting as two characters (‘\r\n’). EC encoded objects may include a small overage. |
logicalUnprocessed | count | The number of streams in the cluster not yet accounted for in the other statistics. After implementation, it drops until it catches up, approaching zero. | When compared to the number of streams in the cluster, allows rough verification of the other statistics, especially following the first boot after it is implemented. |
Note
These are cluster-level statistics, so each node is publishing the same values.
Get logicalObjects, logicalSpace, and logicalUnprocessed by polling a node using SNMP:
SNMP for usage
snmpget -m +CARINGO-CASTOR-MIB -v2c -M +/usr/share/snmp/mib2c-data -cPASSWORD -OQs {node-ip} logicalSpace
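The same poll can be scripted. The sketch below wraps the snmpget command shown above and requests all three statistics in one call. The helper names are hypothetical. It assumes that snmpget is installed with the Swarm MIB available, and that -OQs prints one "name = value" line per statistic.

```python
import subprocess

STATS = ["logicalObjects", "logicalSpace", "logicalUnprocessed"]


def build_snmpget_command(node_ip, community):
    """Build the argument list for snmpget, mirroring the flags in the text."""
    return [
        "snmpget",
        "-m", "+CARINGO-CASTOR-MIB",
        "-v2c",
        "-M", "+/usr/share/snmp/mib2c-data",
        "-c" + community,
        "-OQs",
        node_ip,
        *STATS,
    ]


def parse_snmp_output(text):
    """Parse 'name = value' lines (assumed -OQs format) into a dict."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(" = ")
        if not sep:
            continue  # skip lines that are not name/value pairs
        name = name.strip().split(".")[0]  # drop an instance suffix such as ".0"
        value = value.strip()
        try:
            values[name] = int(value)
        except ValueError:
            values[name] = value  # keep non-numeric values as text
    return values


def poll_usage(node_ip, community):
    """Run snmpget against one node and return the parsed statistics."""
    result = subprocess.run(
        build_snmpget_command(node_ip, community),
        capture_output=True, text=True, check=True,
    )
    return parse_snmp_output(result.stdout)
```

Because these are cluster-level statistics, polling any single node should be sufficient.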
Get logicalObjects, logicalSpace, and logicalUnprocessed by polling a node using the REST API:
REST call for usage
http://{node-ip}:91/api/storage/clusters/{clustername}
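A minimal Python sketch of the REST call follows. It assumes the endpoint returns JSON. The layout of the response is not described here, so the helper searches the parsed document for the three statistic names rather than assuming where they sit. The function names are hypothetical.

```python
import json
import urllib.request

STAT_NAMES = ("logicalObjects", "logicalSpace", "logicalUnprocessed")


def find_stats(payload, names=STAT_NAMES):
    """Search nested dicts/lists for the named keys; first match wins."""
    found = {}
    stack = [payload]
    while stack:
        item = stack.pop()
        if isinstance(item, dict):
            for key, value in item.items():
                if key in names and key not in found:
                    found[key] = value
                else:
                    stack.append(value)
        elif isinstance(item, list):
            stack.extend(item)
    return found


def fetch_usage(node_ip, cluster_name, timeout=10):
    """GET the cluster resource from one node and extract the usage statistics."""
    url = f"http://{node_ip}:91/api/storage/clusters/{cluster_name}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return find_stats(json.load(response))
```

If the actual response nests the statistics under a known key, a direct lookup would be simpler and safer than the generic search.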
Trends: Each volume in a Swarm cluster is computing partial statistics for logical objects with replicas on other volumes. Swarm works to keep the correct number of replicas (and EC segments) for every object, but, if there are too many replicas, the statistics trend higher. In the case of hardware failure, the statistics trend lower while the recovery is taking place.
Timing: Each volume has accurate partial statistics immediately after a write or delete. Volumes broadcast their statistics every 30 seconds, and the REST API values are available as soon as those broadcasts arrive; SNMP adds up to another 60 seconds because it polls the aggregated values periodically. Metrics does not aggregate, so each periodic metrics report is current with respect to the accounting cursor.
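Read literally, the timing figures above imply a rough upper bound on how stale a published value can be. The sketch below simply adds the intervals from the text. It is an interpretation rather than a documented guarantee, and the names are hypothetical.

```python
BROADCAST_INTERVAL_S = 30  # volumes broadcast every 30 seconds (from the text)
SNMP_POLL_EXTRA_S = 60     # SNMP adds up to another 60 seconds (from the text)


def worst_case_lag_seconds(interface):
    """Approximate maximum staleness, in seconds, for 'rest' or 'snmp'."""
    if interface == "rest":
        return BROADCAST_INTERVAL_S
    if interface == "snmp":
        return BROADCAST_INTERVAL_S + SNMP_POLL_EXTRA_S
    raise ValueError("interface must be 'rest' or 'snmp'")


print(worst_case_lag_seconds("rest"), worst_case_lag_seconds("snmp"))  # 30 90
```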
Usage via Metrics
Usage statistics are reported using Swarm's Metrics mechanism. These metrics are checked on demand, at query time. Although Swarm publishes the statistics under the volume metrics, the values represent the cluster level:
Volume Metrics | Units | Description |
---|---|---|
logical_objects | count | The number of unique objects (including historical versions) stored for the entire cluster. Each content object counts only as 1, regardless of the number of replicas or EC segments that comprise it. |
logical_space | bytes | The logical space stored for the entire cluster. The logical_space value is in bytes, not MB, for greater accuracy. |
logical_unprocessed | count | The number of streams (replicas, EC segments, etc.) in the cluster not yet accounted for in the other statistics. |
© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.