If you want to enumerate an entire cluster and you already have a Search (Indexer) Feed configured, you can use the indexer-enumerator.sh
script from the support tools bundle to do so.
For a smaller query, it might be easier to use the Content UI portal (if it’s installed on a Content Gateway). This script is for enumerating potentially large data sets where the UI would be less helpful.
Tips
You can run the script with “bash -x” to get examples of the curl syntax that you can adapt for your own custom indexer calls.
You can search by domain, bucket, prefix, size, date written, and type of object.
When you have the match you want, you can remove the
-orc
options and from there output the object match results to file.
Be careful to run this script from a directory/partition with plenty of disk space if you are returning millions of objects.
For full enumerations of larger data sets, you may want to add the -s
option to echo the enumerator loop count. Each call to the indexer has a maximum of 10k returned values, so knowing how many iterations of that 10k figure the script has returned is valuable for larger enumerations.
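As a rough illustration of the 10k figure, the number of indexer calls a single query needs can be estimated from its object count. This is only a sketch: calls_needed is a hypothetical helper (it is not part of the script), and the page size of 10000 is taken from the tip above.

```shell
# Hypothetical helper: estimate how many indexer calls one query needs,
# assuming each call returns at most 10000 results (per the tip above).
calls_needed() {
    total=$1
    page=${2:-10000}
    # Integer ceiling division: a partial last page still costs one call.
    echo $(( (total + page - 1) / page ))
}

calls_needed 65594   # the "All domains" object count from the example below
```

Comparing that estimate with the loop count echoed by -s gives a sense of how far along a long enumeration is.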
Instructions
This is an extended example of how you can use this script to investigate what is in your cluster.
The environment variable SCSP_HOST is set to a storage node IP to avoid having to put -a [storage-node-ip]
on every example below.
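Setting that variable might look like the following. The address is a placeholder from the documentation range, not a real storage node, so substitute one of your own.

```shell
# 192.0.2.10 is a placeholder address (assumption), not a real storage node.
export SCSP_HOST=192.0.2.10

# Commands run from this shell can then omit -a, for example:
#   indexer-enumerator.sh -D
```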
Listing domains
Run indexer-enumerator.sh -D
to find out what domains exist in your cluster.
[root@c-csn1 tmp]# indexer-enumerator.sh -D
A complete domain listing can be found here: ./OUTPUTDIR-2020_0722-124732/domains.txt
Because a domain listing should be short, I use the -or
options to output the results to stdout:
[root@c-csn1 tmp]# indexer-enumerator.sh -D -or
Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com
Counting objects and space usage
Now I know the domains but not what’s in them. Next, to find out how many objects are in each domain and how much space each takes, I combine the -c
option with the -d ALL
option:
[root@c-csn1 tmp]# indexer-enumerator.sh -d ALL -c
Enumerating all domains in the cluster:
A complete domain listing can be found here: ./OUTPUTDIR-2020_0722-124949/domains.txt
test1.c-csn1.enfield.com/ has 3147 unique matching objects of stype: all, withreps, uses 458.44MB disk space
caringodrive.c-csn1.enfield.com/ has 20 unique matching objects of stype: all, withreps, uses 156.55MB disk space
filefly-c-csn1.enfield.com/ has 1597 unique matching objects of stype: all, withreps, uses 9.32GB disk space
c-csn1-test1.enfield.com/ has 1114 unique matching objects of stype: all, withreps, uses 971.29MB disk space
c-csn1-admindomain/ has 38 unique matching objects of stype: all, withreps, uses 382.00bytes disk space
m-csn4.enfield.com/ has 8 unique matching objects of stype: all, withreps, uses 184.14MB disk space
nfstest1.enfield.com/ has 19 unique matching objects of stype: all, withreps, uses 13.59MB disk space
filefly-s3-target.c-csn1.enfield.com/ has 8217 unique matching objects of stype: all, withreps, uses 656.16MB disk space
es-backups.enfield.com/ has 41360 unique matching objects of stype: all, withreps, uses 3.69GB disk space
c-csn1.enfield.com/ has 129 unique matching objects of stype: all, withreps, uses 2.12GB disk space
bob.is.great.com/ has 11 unique matching objects of stype: all, withreps, uses 10.81MB disk space
s3-compatible/ has 5 unique matching objects of stype: all, withreps, uses 5.86MB disk space
c-csn1-cfs1.enfield.com/ has 9853 unique matching objects of stype: all, withreps, uses 259.00MB disk space
c-csn1-s3-target.enfield.com/ has 76 unique matching objects of stype: all, withreps, uses 428.23MB disk space
Only streams counts are listed. To get the streams themselves, remove the -c flag.
All domains: 65594 unique matching objects of stype: all, withreps, uses 18.21GB disk space
This gives me a good idea of what’s in my cluster.
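With more domains than this, I might want the count output ranked. The sketch below assumes the per-domain lines look exactly like the ones shown above; rank_domains is a hypothetical helper, and the sample text is a handful of lines copied from that output rather than a live run.

```shell
# Sample lines copied from the count output above. In practice this would be
# the saved output of the -d ALL -c run.
counts='test1.c-csn1.enfield.com/ has 3147 unique matching objects of stype: all, withreps, uses 458.44MB disk space
filefly-c-csn1.enfield.com/ has 1597 unique matching objects of stype: all, withreps, uses 9.32GB disk space
es-backups.enfield.com/ has 41360 unique matching objects of stype: all, withreps, uses 3.69GB disk space
c-csn1-cfs1.enfield.com/ has 9853 unique matching objects of stype: all, withreps, uses 259.00MB disk space
All domains: 65594 unique matching objects of stype: all, withreps, uses 18.21GB disk space'

# Hypothetical helper: print "count domain" pairs, largest count first.
# Only lines of the form "<domain>/ has <N> unique ..." are used, so the
# "All domains:" summary and status messages are ignored.
rank_domains() {
    awk '$2 == "has" && $4 == "unique" { sub(/\/$/, "", $1); print $3, $1 }' |
        sort -rn
}

printf '%s\n' "$counts" | rank_domains
```

This ranks by object count only; ranking by disk space would additionally need the MB/GB suffixes converted to a common unit.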
Counting untenanted objects
What it does not show me are the untenanted objects (those not in any domain). Older clusters may not have any domains and so all of the objects would be untenanted. Newer clusters will have most or all objects tenanted and use enforceTenancy=true
in the cluster configuration to ensure that all objects are in a domain.
We can see if we have any untenanted objects by using the -t
option. I will again use the -c
option just to get a count of the number of objects.
[root@c-csn1 tmp]# indexer-enumerator.sh -t -c
Only streams counts are listed. To get the streams themselves, remove the -c flag.
Untenanted streams enumerated: 9 unique objects, withreps, uses 101.44KB disk space
By this, I learn that I have only 9 untenanted objects in this particular cluster.
Counting buckets
Going back to the all domains output, I see the c-csn1-test1.enfield.com
domain looks interesting to me because the domain name doesn’t give me a good idea what’s in it (in the way that the filefly-c-csn1.enfield.com
and es-backups.enfield.com
do).
So, let’s drill down into that domain by using the -d c-csn1-test1.enfield.com
option.
How many buckets live in here?
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -B -c
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/ has 20 unique objects of stype: bucket, withreps, uses 0 disk space.
There appear to be 20 buckets here, and they seem to use no disk space. That’s because I asked for only bucket objects, which don’t take up data. To see how much data resides inside a particular bucket, I would need to do a query on that bucket. Also, there might be unnamed objects that live in this domain (that is, are named by UUID and do not live in a bucket).
Let’s see what buckets exist in this domain (not just count them, as we did above):
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -B
Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com
c-csn1-test1.enfield.com/.TOKEN
c-csn1-test1.enfield.com/Bucket15917374547579_0
c-csn1-test1.enfield.com/Bucket15917374547579_1
c-csn1-test1.enfield.com/Bucket15917374547579_2
c-csn1-test1.enfield.com/Bucket15917374547579_3
c-csn1-test1.enfield.com/Bucket15917374547579_4
c-csn1-test1.enfield.com/Bucket15917374547579_5
c-csn1-test1.enfield.com/Bucket15917374547579_6
c-csn1-test1.enfield.com/Bucket15917374547579_7
c-csn1-test1.enfield.com/Bucket15917374547579_8
c-csn1-test1.enfield.com/Bucket15917374547579_9
c-csn1-test1.enfield.com/Bucket15917383799242_0
c-csn1-test1.enfield.com/Bucket15917383799242_1
c-csn1-test1.enfield.com/Bucket15917383799242_2
c-csn1-test1.enfield.com/Bucket15917383799242_3
c-csn1-test1.enfield.com/Bucket15917383799242_4
c-csn1-test1.enfield.com/pants
c-csn1-test1.enfield.com/10kbuckettest
c-csn1-test1.enfield.com/superpants
c-csn1-test1.enfield.com/20200622
c-csn1-test1.enfield.com/ has 20 unique objects of stype: bucket, withreps, uses 0 disk space.
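Since bucket objects themselves report no disk space, a per-bucket count query is needed for each one. The sketch below turns a bucket listing like the one above into one count command per bucket, using only the -ro, -d, -b and -c options shown in this article. bucket_count_commands is a hypothetical helper; it prints the commands rather than running them, and it assumes bucket names contain no spaces.

```shell
domain='c-csn1-test1.enfield.com'

# A few lines copied from the bucket listing above, including the status
# and summary lines so the filtering is visible.
listing='Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com
c-csn1-test1.enfield.com/pants
c-csn1-test1.enfield.com/10kbuckettest
c-csn1-test1.enfield.com/superpants
c-csn1-test1.enfield.com/ has 20 unique objects of stype: bucket, withreps, uses 0 disk space.'

# Hypothetical helper: print (not run) one count command per bucket.
# Remove the "echo" to actually run them.
bucket_count_commands() {
    while IFS= read -r line; do
        case $line in
            *' '*) continue ;;     # status or summary line, not a bucket path
            "$1"/?*) ;;            # looks like domain/bucket
            *) continue ;;
        esac
        bucket=${line#"$1"/}       # strip the leading "domain/"
        echo indexer-enumerator.sh -ro -d "$1" -b "$bucket" -c
    done
}

printf '%s\n' "$listing" | bucket_count_commands "$domain"
```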
Searching objects
I see that I have a bucket named “pants”. Let’s see how many objects live in my pants
bucket.
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -b pants -c
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/pants has 3 unique objects of stype: all, withreps, uses 11.83KB disk space.
As there are only three, I will output them to stdout (keeping the -or
flags and removing the -c
flag):
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -b pants
Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com
c-csn1-test1.enfield.com/pants/vimium-options.json
c-csn1-test1.enfield.com/pants/vimium-options-2020-mbp.json
c-csn1-test1.enfield.com/pants/plugins.txt
c-csn1-test1.enfield.com/pants has 3 unique objects of stype: all, withreps, uses 11.83KB disk space.
I keep using the -c
option because I could potentially make a query that returns millions if not billions of results. Certainly I don’t want to do that right now. From the above, I see that I have 3 files in that bucket.
Because that domain should be an FQDN that resolves to the Content Gateway (or Swarm cluster, if not using Gateway), I can simply run a curl INFO (HEAD) request against any of these files to see more information:
[root@c-csn1 tmp]# curl -IL c-csn1-test1.enfield.com/pants/plugins.txt -ucaringoadmin:caringo
HTTP/1.1 200 OK
Date: Wed, 22 Jul 2020 18:28:05 GMT
Gateway-Request-Id: ED6BE75CEE440295
Server: CAStor Cluster/11.2.0
Via: 1.1 c-csn1-test1.enfield.com (Cloud Gateway SCSP/6.4.0)
Gateway-Protocol: scsp
Castor-System-CID: 15a648db93dc29a6819bb256643915fc
Castor-System-Cluster: c-csn1.enfield.com
Castor-System-Created: Fri, 19 Jun 2020 21:31:53 GMT
Castor-System-Name: plugins.txt
Castor-System-Version: 1592602313.352
Content-Type: application/x-www-form-urlencoded
Last-Modified: Fri, 19 Jun 2020 21:31:53 GMT
X-Last-Modified-By-Meta: acepelon@
X-Owner-Meta: acepelon
ETag: "f877345eb91e9b72ad44d2a4480af33c"
Castor-System-Path: /c-csn1-test1.enfield.com/pants/plugins.txt
Castor-System-Domain: c-csn1-test1.enfield.com
Volume: 53a22d293eea60eb4bfaacc9933f12d6
Content-MD5: Li8xabfpx+wi+MMZFE3Uqg==
Content-Length: 616
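When checking many objects, I usually only care about one or two of those headers. The sketch below pulls a single header value out of saved curl -I output; header_value is a hypothetical helper, and the sample text is a few lines copied from the response above rather than a live request.

```shell
# A few header lines copied from the curl output above.
headers='HTTP/1.1 200 OK
Castor-System-Created: Fri, 19 Jun 2020 21:31:53 GMT
Castor-System-Name: plugins.txt
Content-Length: 616'

# Hypothetical helper: print the value of one header, matching the header
# name case-insensitively.
header_value() {
    awk -v name="$1" '
        BEGIN { want = tolower(name) ":" }
        tolower($1) == want {
            sub(/^[^:]*:[ \t]*/, "")   # drop the "Name: " part
            sub(/\r$/, "")             # real HTTP header lines end with CR
            print
        }'
}

printf '%s\n' "$headers" | header_value Castor-System-Created
printf '%s\n' "$headers" | header_value content-length
```

Against a live object, the same helper could read from the curl command shown above through a pipe.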
I can see that I wrote this object recently, and there aren’t many objects in this bucket. I am going to poke around more.
Searching unnamed objects
So, if that bucket doesn’t contain a majority of the objects in my domain, what bucket does? Or, perhaps unnamed objects are the majority of my objects. We can search only for unnamed objects like so using the -u
option:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -u -c
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/ has 79 unique objects of stype: unnamed, withreps, uses 80.33KB disk space.
We might be tempted to use the -t option for “untenanted” objects, because untenanted objects are always unnamed, but these objects ARE tenanted (meaning, they live in a domain) but are also unnamed. Therefore, using -d [domain] -t will error.
Ok, we have 79 unnamed objects that live in c-csn1-test1.enfield.com. I want to get a few examples of these to show you what unnamed objects in a domain look like, but I don’t want to output all 79 to stdout. I will use the -u -1 -M 5
options to say “send only a single request for results (-1), return only 5 items in that request (-M 5), and return only unnamed objects (-u)”:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -u -1 -M 5
Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com
c-csn1-test1.enfield.com/7f7c9ecde7f265ac7dd4ba81e4388540
c-csn1-test1.enfield.com/f5b214774783d8bcc91ceae67c50a080
c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054
c-csn1-test1.enfield.com/6fe0dbcdbf8bc538250f655dd152b5fd
c-csn1-test1.enfield.com/3122faaa7d02f9f7438702bf6bedb6ff
c-csn1-test1.enfield.com/ has 79 unique objects of stype: unnamed, withreps, uses 80.33KB disk space.
I can now curl any of these objects if I wanted to see their headers:
[root@c-csn1 tmp]# curl -IL c-csn1-test1.enfield.com/3122faaa7d02f9f7438702bf6bedb6ff -ucaringoadmin:caringo
HTTP/1.1 200 OK
Date: Wed, 22 Jul 2020 18:42:48 GMT
Gateway-Request-Id: FE7629C67527A767
Server: CAStor Cluster/11.2.0
Via: 1.1 c-csn1-test1.enfield.com (Cloud Gateway SCSP/6.4.0)
Gateway-Protocol: scsp
Castor-System-CID: 21876415934a554d1072804cfc776e10
Castor-System-Cluster: c-csn1.enfield.com
Castor-System-Created: Tue, 23 Jun 2020 15:07:45 GMT
Content-Type: application/x-www-form-urlencoded
Last-Modified: Tue, 23 Jun 2020 15:07:45 GMT
X-Last-Modified-By-Meta:
X-Owner-Meta:
x-bob-meta-apples: dunkin
ETag: "3122faaa7d02f9f7438702bf6bedb6ff"
Castor-System-Domain: c-csn1-test1.enfield.com
Volume: fa52b18e98d6164c5c0b700bba9652bb
Content-MD5: Ep4TEA3HwH8cOehCM1zZIQ==
Content-Length: 412
Searching metadata
Notice I have a metadata header called “x-bob-meta-apples” with value of “dunkin”. That’s interesting to me. I wonder if I have that metadata elsewhere in this domain:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -c
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/ has 43 unique objects of stype: all, withreps, uses 3.42MB disk space.
The -m and -v options together show me that indeed I do have 43 matching objects. I wonder if I have any other objects that match the header but not necessarily that value. For this test, I simply remove the -v dunkin
part of the command:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -c
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/ has 78 unique objects of stype: all, withreps, uses 3.46MB disk space.
Since I have more results here, I know that I have that header with a different value.
Searching across multiple domains
One of the more powerful things about the indexer-enumerator.sh is that I can search across all domains, not just one domain. Let’s see how many objects matching that metadata header I have across my whole cluster. For this query, I change the domain name to “ALL” and I am just going to get a count match by using the -c option again:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d ALL -m x-bob-meta-apples -c
Enumerating all domains in the cluster:
Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com
test1.c-csn1.enfield.com/ has 7 unique matching objects of stype: all, withreps, uses 7.32KB disk space
caringodrive.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-c-csn1.enfield.com/ has 42 unique matching objects of stype: all, withreps, uses 41.67KB disk space
c-csn1-test1.enfield.com/ has 78 unique matching objects of stype: all, withreps, uses 3.46MB disk space
c-csn1-admindomain/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
m-csn4.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
nfstest1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-s3-target.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
es-backups.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
bob.is.great.com/ has 6 unique matching objects of stype: all, withreps, uses 10.81MB disk space
s3-compatible/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-cfs1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-s3-target.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
Only streams counts are listed. To get the streams themselves, remove the -c flag.
All domains: 133 unique matching objects of stype: all, withreps, uses 14.32MB disk space
That shows me 4 different domains (although it doesn’t show me untenanted objects that may match) have objects with that metadata. I can then narrow the search down to match that particular header value “dunkin”:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d ALL -m x-bob-meta-apples -v dunkin -c
Enumerating all domains in the cluster:
Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com
test1.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
caringodrive.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-c-csn1.enfield.com/ has 14 unique matching objects of stype: all, withreps, uses 14.64KB disk space
c-csn1-test1.enfield.com/ has 43 unique matching objects of stype: all, withreps, uses 3.42MB disk space
c-csn1-admindomain/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
m-csn4.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
nfstest1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-s3-target.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
es-backups.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
bob.is.great.com/ has 3 unique matching objects of stype: all, withreps, uses 5.40MB disk space
s3-compatible/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-cfs1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-s3-target.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
Only streams counts are listed. To get the streams themselves, remove the -c flag.
All domains: 60 unique matching objects of stype: all, withreps, uses 8.84MB disk space
73 fewer objects. Let’s try a different header value:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d ALL -m x-bob-meta-apples -v donuts -c
Enumerating all domains in the cluster:
Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com
test1.c-csn1.enfield.com/ has 7 unique matching objects of stype: all, withreps, uses 7.32KB disk space
caringodrive.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-c-csn1.enfield.com/ has 28 unique matching objects of stype: all, withreps, uses 27.02KB disk space
c-csn1-test1.enfield.com/ has 35 unique matching objects of stype: all, withreps, uses 36.62KB disk space
c-csn1-admindomain/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
m-csn4.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
nfstest1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-s3-target.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
es-backups.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
bob.is.great.com/ has 3 unique matching objects of stype: all, withreps, uses 5.40MB disk space
s3-compatible/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-cfs1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-s3-target.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
Only streams counts are listed. To get the streams themselves, remove the -c flag.
All domains: 73 unique matching objects of stype: all, withreps, uses 5.47MB disk space
Ah! This shows me that all of the objects matching that header have a value of either “dunkin” or “donuts”.
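The reasoning behind that conclusion is simple arithmetic on the three cluster-wide totals, sketched below. The variable names are my own, and the check assumes each object carries a single value for the header.

```shell
# Cluster-wide totals taken from the three runs above.
any_value=133   # -m x-bob-meta-apples
dunkin=60       # -m x-bob-meta-apples -v dunkin
donuts=73       # -m x-bob-meta-apples -v donuts

# If the two values account for every object with the header, nothing is left over.
if [ $(( dunkin + donuts )) -eq "$any_value" ]; then
    echo "dunkin + donuts accounts for all $any_value objects"
else
    echo "$(( any_value - dunkin - donuts )) objects have some other value"
fi
```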
Searching by age
What if I was only interested in objects written long ago? Maybe I want to find all objects written x days ago so that I can delete them…
Let’s get a single object from the matching output above and then do a curl INFO.
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -1 -M 1
Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com
c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054
c-csn1-test1.enfield.com/ has 43 unique objects of stype: all, withreps, uses 3.42MB disk space.
[root@c-csn1 tmp]# curl -IL c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054 -ucaringoadmin:caringo
HTTP/1.1 200 OK
Date: Wed, 22 Jul 2020 18:57:27 GMT
Gateway-Request-Id: 58E8B3631B9490E0
Server: CAStor Cluster/11.2.0
Via: 1.1 c-csn1-test1.enfield.com (Cloud Gateway SCSP/6.4.0)
Gateway-Protocol: scsp
Castor-System-CID: 21876415934a554d1072804cfc776e10
Castor-System-Cluster: c-csn1.enfield.com
Castor-System-Created: Tue, 23 Jun 2020 15:07:45 GMT
Content-Type: application/x-www-form-urlencoded
Last-Modified: Tue, 23 Jun 2020 15:07:45 GMT
X-Last-Modified-By-Meta:
X-Owner-Meta:
x-bob-meta-apples: dunkin
ETag: "e0896cec233e382c17840ae1c7d92054"
Castor-System-Domain: c-csn1-test1.enfield.com
Volume: fa52b18e98d6164c5c0b700bba9652bb
Content-MD5: 6AspDUv0/7hEBsMFALI5Ig==
Content-Length: 858
I can see that it was written on June 23 of this year. Were ALL of the objects matching that header written this year? We can check by using the -G 1
and -g 1
options:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -G 1 -c
Only enumerating streams written since 1 year(s) ago
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/ has 43 unique objects of stype: all, withreps, uses 3.42MB disk space.
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -g 1 -c
Only enumerating streams written at least 1 year(s) ago
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/ has 0 unique objects of stype: all, withreps, uses 0 disk space.
Yes, they were all written this year. Since the object example we had was written on June 23 (today is July 22), I can do some further narrowing down based on my example. June 23 was 29 days ago from when I am running these examples:
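To avoid counting days by hand, the day difference can be computed in the shell. This is a sketch that relies on GNU date (the -d option is a GNU extension), and days_between is a hypothetical helper. It counts whole calendar days in UTC; how the script treats the time of day when applying its day threshold is not something I am asserting here.

```shell
# Hypothetical helper: whole days between two dates, using GNU date.
days_between() {
    start=$(date -u -d "$1" +%s)   # seconds since the epoch at 00:00 UTC
    end=$(date -u -d "$2" +%s)
    echo $(( (end - start) / 86400 ))
}

days_between 2020-06-23 2020-07-22
```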
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 29 -c
Only enumerating streams written at least 29 day(s) ago
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/ has 14 unique objects of stype: all, withreps, uses 14.64KB disk space.
14 objects matched that, and our test object in particular you can see matches as expected:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 29 | grep e08
c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054
But only 14 of 43 objects were written at least 29 days ago. Were any written at least 30 days ago?
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 30 -c
Only enumerating streams written at least 30 day(s) ago
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/ has 0 unique objects of stype: all, withreps, uses 0 disk space.
Nope.
I can further winnow my results as desired.
Let’s back out a little bit and add another option.
Searching by size
What if we were looking for small files with that same metadata? Let’s try to match objects about the same size as our example above, which was 858 bytes. To that end, I will add the -l 859
(l for “littler”) option to our query.
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 29 -l 859 -c
Only streams smaller than 859 bytes are listed.
Only enumerating streams written at least 29 day(s) ago
Only streams counts are listed. To get the streams themselves, remove the -c flag.
c-csn1-test1.enfield.com/ has 12 unique objects of stype: all, withreps, uses 10.11KB disk space.
Ok, 12 of the 14 objects were smaller than 859 bytes. Nice to know. I want to verify that my example object is in that result set as a sanity check. The UUID started with e, so let’s use yet another option: the prefix match. I will add -p e
to match any object starting with “e” in its name/UUID. I will remove the -c
option so that I am actually seeing the match:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 29 -l 859 -p e
Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com
c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054
Only streams smaller than 859 bytes are listed.
Only streams with names (or UUIDs) starting with "e" are listed.
Only enumerating streams written at least 29 day(s) ago
c-csn1-test1.enfield.com/ has 1 unique objects of stype: all, withreps, uses 1.67KB disk space.
[root@c-csn1 tmp]#
Sure enough, there’s our object!
Now, what if I want to match all objects larger than that object across all domains, matching that same header, written at least 29 days ago? I will use the capital L option and change the domain to “ALL”:
[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d ALL -m x-bob-meta-apples -v dunkin -f 29 -L 859 -c
Enumerating all domains in the cluster:
Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com
test1.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
caringodrive.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-test1.enfield.com/ has 2 unique matching objects of stype: all, withreps, uses 4.53KB disk space
c-csn1-admindomain/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
m-csn4.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
nfstest1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-s3-target.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
es-backups.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
bob.is.great.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
s3-compatible/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-cfs1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-s3-target.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
Only streams larger than 859 bytes are listed.
Only enumerating streams written at least 29 day(s) ago
Only streams counts are listed. To get the streams themselves, remove the -c flag.
All domains: 2 unique matching objects of stype: all, withreps, uses 4.53KB disk space
I can see that only 2 objects match. I can remove the -c option to get those results if I want.
Hopefully the above gives you a good understanding of how the indexer-enumerator.sh script works and the power of its flexibility.