The Swarm Search API allows you to compute the total size of objects stored in a bucket or domain by using the du query argument.
Example request and response for calculating the space used by all content in the domain:
HTTP/1.1 200 OK
Gateway-Request-Id: 26F809F67D883E6D
Content-Type: application/json; charset=utf-8
Castor-Object-Count: 17
Castor-Bytes-Used-With-Reps: 121590
Date: Tue, 19 Jun 2012 22:00:16 GMT
Server: CAStor Cluster/
Via: (Cloud Gateway/1.1)
Content-Length: 4
[
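In the response above, the totals are carried in the Castor-Object-Count and Castor-Bytes-Used-With-Reps headers rather than in the body. The following is a minimal Python sketch of reading them from a raw response head; the function name and the returned keys are illustrative choices, not part of the Swarm API.

```python
def parse_usage_headers(raw_response: str) -> dict:
    """Pull the object count and bytes used out of a raw HTTP response head."""
    usage = {}
    for line in raw_response.splitlines():
        # Header lines look like "Name: value"; the status line has no colon.
        name, sep, value = line.partition(":")
        if not sep:
            continue
        name = name.strip().lower()
        if name == "castor-object-count":
            usage["object_count"] = int(value.strip())
        elif name == "castor-bytes-used-with-reps":
            usage["bytes_used_with_reps"] = int(value.strip())
    return usage


# Header values copied from the example response above.
sample = """HTTP/1.1 200 OK
Gateway-Request-Id: 26F809F67D883E6D
Castor-Object-Count: 17
Castor-Bytes-Used-With-Reps: 121590
Date: Tue, 19 Jun 2012 22:00:16 GMT
"""
print(parse_usage_headers(sample))
```

Header names are matched case-insensitively because HTTP does not guarantee their capitalization.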
To determine the space occupied by each content type stored in your entire cluster, use the Elasticsearch Indexer's Terms Stats Facets API (if you use version 0.9.x or lower):
curl -XGET -d @query.json "http://<IP_of_Elasticsearch_Indexer_Node>:9200/<name_of_your_Swarm_cluster>/_search?pretty"
where query.json looks like:
{
  "facets" : {
    "contentTypes_stats" : {
      "terms_stats" : {
        "key_field" : "contentType",
        "value_field" : "size",
        "size" : 0
      },
      "global" : true
    }
  }
}
The value_field can be "size" (the total space will not include reps) or "sizewithreps" (the total space will include reps).
This will return the stats for all the content_types in your cluster.
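If you switch between the two value_field options often, the query body can be generated rather than edited by hand. This is a small Python sketch under stated assumptions: the function name and the cluster_wide flag are hypothetical, and placing "global" next to "terms_stats" inside the facet follows my reading of the Elasticsearch facets layout, so verify it against your Indexer version.

```python
import json

ALLOWED_VALUE_FIELDS = ("size", "sizewithreps")


def build_content_type_query(value_field="size", cluster_wide=True):
    """Build the terms_stats facet body that groups object sizes by contentType."""
    if value_field not in ALLOWED_VALUE_FIELDS:
        raise ValueError(f"value_field must be one of {ALLOWED_VALUE_FIELDS}")
    facet = {
        "terms_stats": {
            "key_field": "contentType",
            "value_field": value_field,
            "size": 0,
        }
    }
    if cluster_wide:
        # Assumption: "global" is a sibling of "terms_stats" within the facet.
        facet["global"] = True
    return {"facets": {"contentTypes_stats": facet}}


# Print a body that could be saved as query.json (totals including reps).
print(json.dumps(build_content_type_query("sizewithreps"), indent=2))
```

Rejecting any other value_field early avoids sending a query that would fail or silently aggregate the wrong field.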
Example response:
"facets" : {
  "contentTypes_stats" : {
    "_type" : "terms_stats",
    "missing" : 0,
    "terms" : [ {
      "term" : "text/html",
      "count" : 56,
      "total_count" : 56,
      "min" : 29.0,
      "max" : 30016.0,
      "total" : 112787.0,
      "mean" : 2014.0535714285713
    }, {
      "term" : "image/png",
      "count" : 54,
      "total_count" : 54,
      "min" : 284.0,
      "max" : 326408.0,
      "total" : 602879.0,
      "mean" : 11164.425925925925
    } ]
  }
}
The total field of each entry in facets[contentTypes_stats][terms] is the total space occupied by objects of that content type (with or without reps, depending on the value_field in your query.json).
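To turn such a response into per-type byte counts, you can walk the terms list and read each total. The following Python sketch assumes the response has already been decoded from JSON into a dictionary; the function name is illustrative, and the sample values are the ones from the example response above.

```python
def totals_by_content_type(response: dict) -> dict:
    """Map each content type to the total bytes reported by the facet."""
    terms = response["facets"]["contentTypes_stats"]["terms"]
    # Totals arrive as floats (e.g. 112787.0); convert them to whole bytes.
    return {entry["term"]: int(entry["total"]) for entry in terms}


sample_response = {
    "facets": {
        "contentTypes_stats": {
            "_type": "terms_stats",
            "missing": 0,
            "terms": [
                {"term": "text/html", "count": 56, "total": 112787.0},
                {"term": "image/png", "count": 54, "total": 602879.0},
            ],
        }
    }
}

totals = totals_by_content_type(sample_response)
print(totals)
print("all listed types:", sum(totals.values()), "bytes")
```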
To determine the space occupied by each content type stored in a particular domain, follow these steps:
a) Determine the name-id mappings for all the domains in your cluster:
curl -XGET "http://<IP_of_Swarm_node>/?domains&format=json&fields=domainid,name"
Example response:
{"domainid":"f9336c9ceecca321bb6c6408b008d141", "name":"cloudscaler3demo.internal"},
{"domainid":"9dc45e197fc307229d53db5762c5b232", "name":"thesmiths"},
{"domainid":"672f29fd63b90a644206c101888399de", "name":"thebrowns"},
{"domainid":"38c22d3f971d0abd9147399fffd9592f", "name":"theflintstones"},
{"domainid":"c7a7212bc42efbc1fdf4943a6a368efc", "name":"gatewayadmindomain"},
{"domainid":"ad18fffec74600e932a1f16025aba265", "name":"therubbles"}
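Step b) needs the domain id, while you usually know the domain by name, so a name-to-id lookup is convenient. This Python sketch assumes the full response body is a JSON array wrapping entries like those shown above (the excerpt shows only the entries); the function name is illustrative.

```python
import json


def domain_ids_by_name(listing_json: str) -> dict:
    """Build a {domain name: domain id} lookup from the domain listing."""
    return {entry["name"]: entry["domainid"] for entry in json.loads(listing_json)}


# Two entries copied from the example response, wrapped in an array (assumption).
listing = """[
  {"domainid":"9dc45e197fc307229d53db5762c5b232", "name":"thesmiths"},
  {"domainid":"672f29fd63b90a644206c101888399de", "name":"thebrowns"}
]"""

ids = domain_ids_by_name(listing)
print(ids["thesmiths"])
```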
b) For each domain, run the following query against your Elasticsearch Indexer:
curl -XGET -d @query.json "http://<Indexer_IP>:9200/<name_of_SWARM_Cluster>/_search?q=domainid:<ID_of_Swarm_Domain>&pretty"
where query.json is:
{
  "facets" : {
    "contentTypes_stats" : {
      "terms_stats" : {
        "key_field" : "contentType",
        "value_field" : "size",
        "size" : 0
      }
    }
  }
}
The value_field can be "size" (the total space will not include reps) or "sizewithreps" (the total space will include reps).
Example response:
"facets" : {
  "contentTypes_stats" : {
    "_type" : "terms_stats",
    "missing" : 0,
    "terms" : [ {
      "term" : "application/castorcontext",
      "count" : 3,
      "total_count" : 3,
      "min" : 0.0,
      "max" : 0.0,
      "total" : 0.0,
      "mean" : 0.0
    }, {
      "term" : "application/test",
      "count" : 2,
      "total_count" : 2,
      "min" : 2.00863744E8,
      "max" : 2.00863744E8,
      "total" : 4.01727488E8,
      "mean" : 2.00863744E8
    ................
As in the previous case, the total field of each entry in facets[contentTypes_stats][terms] is the total space occupied by objects of that content type within the queried domain (with or without reps, depending on the value_field in your query.json).
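Because step b) repeats the same search once per domain, the search URLs can be generated from the domain listing obtained in step a). The Python sketch below only builds the URLs and sends nothing; the function name, the address 192.0.2.10, and the cluster name mycluster are placeholders, and the default port 9200 is taken from the curl example.

```python
from urllib.parse import quote


def domain_search_urls(indexer_ip, cluster_name, domains, port=9200):
    """Return {domain name: search URL} with one URL per domain for step b)."""
    base = f"http://{indexer_ip}:{port}/{quote(cluster_name, safe='')}/_search"
    return {
        d["name"]: f"{base}?q=domainid:{quote(d['domainid'], safe='')}&pretty"
        for d in domains
    }


# Entries copied from the example domain listing in step a).
domains = [
    {"domainid": "9dc45e197fc307229d53db5762c5b232", "name": "thesmiths"},
    {"domainid": "672f29fd63b90a644206c101888399de", "name": "thebrowns"},
]

urls = domain_search_urls("192.0.2.10", "mycluster", domains)
for name, url in urls.items():
    print(name, url)
```

Each generated URL can then be passed to curl together with -d @query.json, as in step b).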