If you want to enumerate an entire cluster and you have a Search (Indexer) Feed already configured, you may use the indexer-enumerator.sh script from the support tools bundle to do so.

For a smaller query, it might be easier to use the Content UI portal (if it’s installed on a Content Gateway). This script is for enumerating potentially large data sets where the UI would be less helpful.

Tips

  • You can run the script with “bash -x” to get examples of the curl syntax that you can adapt for your own custom indexer calls.

  • You can search by domain, bucket, prefix, size, date written, and type of object.

  • Once the query matches what you want, remove the -orc options to have the matching objects written to a file.
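As a rough sketch of the first tip: bash -x writes its trace to stderr and prefixes each executed command with one or more "+" characters. Assuming the script invokes curl directly (the trace layout of indexer-enumerator.sh itself is not shown on this page), a small filter such as the hypothetical curl_lines helper below could pull just the curl calls out of a saved trace. The trace text in the example is illustrative, not real script output.

```shell
# Hypothetical helper: keep only the curl commands from a "bash -x" trace.
# xtrace lines begin with one or more "+" characters (one per nesting level).
curl_lines() {
    grep -E '^\++ curl ' | sed -E 's/^\++ //'
}

# Illustrative trace text (192.0.2.10 is a documentation-only address).
printf '%s\n' \
    '+ domain=example.com' \
    '++ curl -s http://192.0.2.10/' \
    '+ echo done' | curl_lines
```

In practice the trace could be captured with something like bash -x "$(command -v indexer-enumerator.sh)" -D 2> trace.log and then filtered with curl_lines < trace.log.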

Be careful to run this script from a directory/partition with plenty of disk space if you are returning millions of objects.

For full enumerations of larger data sets, you may want to add the -s option to echo the enumerator loop count. Each call to the indexer has a maximum of 10k returned values, so knowing how many iterations of that 10k figure the script has returned is valuable for larger enumerations.
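To put the loop count in perspective, here is a minimal sketch of the arithmetic, assuming "10k" means exactly 10000 results per indexer call. The function name iterations_needed is my own and is not part of the script.

```shell
# Number of indexer calls needed for a result set, assuming each call
# returns at most 10000 objects (the "10k" figure mentioned above).
iterations_needed() {
    objects="$1"
    page_size="${2:-10000}"
    # Integer ceiling division: round up to a whole number of calls.
    echo $(( (objects + page_size - 1) / page_size ))
}

iterations_needed 65594     # prints 7
iterations_needed 2500000   # prints 250
```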

Instructions

This is an extended example of how you can use this script to investigate what is in your cluster.

The environment variable SCSP_HOST is set to a storage node IP to avoid having to put -a [storage-node-ip] on every example below.
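For reference, setting the variable for the current shell session might look like the sketch below. The IP address is a placeholder from the documentation-only range, and the scsp_target helper is hypothetical; it only reports which node would be used.

```shell
# Placeholder storage node IP; substitute one of your own storage nodes.
export SCSP_HOST=192.0.2.10

# Hypothetical guard: report the target node, or warn if none is set.
scsp_target() {
    if [ -n "${SCSP_HOST:-}" ]; then
        echo "using storage node ${SCSP_HOST}"
    else
        echo "SCSP_HOST is not set; pass -a [storage-node-ip] instead" >&2
        return 1
    fi
}

scsp_target
```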

Listing domains

Run indexer-enumerator.sh -D to find out what domains exist in your cluster.

[root@c-csn1 tmp]# indexer-enumerator.sh -D
A complete domain listing can be found here: ./OUTPUTDIR-2020_0722-124732/domains.txt

Because a domain listing should be short, I use the -or options to output the results to stdout:

[root@c-csn1 tmp]# indexer-enumerator.sh -D -or

Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com

Counting objects and space usage

Now I know the domains but not what’s in them. Next, to find out how many objects are in each domain and how much space each takes, I combine the -c option with the -d ALL option:

[root@c-csn1 tmp]# indexer-enumerator.sh -d ALL -c

Enumerating all domains in the cluster:
A complete domain listing can be found here: ./OUTPUTDIR-2020_0722-124949/domains.txt
test1.c-csn1.enfield.com/ has 3147 unique matching objects of stype: all, withreps, uses 458.44MB disk space
caringodrive.c-csn1.enfield.com/ has 20 unique matching objects of stype: all, withreps, uses 156.55MB disk space
filefly-c-csn1.enfield.com/ has 1597 unique matching objects of stype: all, withreps, uses 9.32GB disk space
c-csn1-test1.enfield.com/ has 1114 unique matching objects of stype: all, withreps, uses 971.29MB disk space
c-csn1-admindomain/ has 38 unique matching objects of stype: all, withreps, uses 382.00bytes disk space
m-csn4.enfield.com/ has 8 unique matching objects of stype: all, withreps, uses 184.14MB disk space
nfstest1.enfield.com/ has 19 unique matching objects of stype: all, withreps, uses 13.59MB disk space
filefly-s3-target.c-csn1.enfield.com/ has 8217 unique matching objects of stype: all, withreps, uses 656.16MB disk space
es-backups.enfield.com/ has 41360 unique matching objects of stype: all, withreps, uses 3.69GB disk space
c-csn1.enfield.com/ has 129 unique matching objects of stype: all, withreps, uses 2.12GB disk space
bob.is.great.com/ has 11 unique matching objects of stype: all, withreps, uses 10.81MB disk space
s3-compatible/ has 5 unique matching objects of stype: all, withreps, uses 5.86MB disk space
c-csn1-cfs1.enfield.com/ has 9853 unique matching objects of stype: all, withreps, uses 259.00MB disk space
c-csn1-s3-target.enfield.com/ has 76 unique matching objects of stype: all, withreps, uses 428.23MB disk space


Only streams counts are listed.  To get the streams themselves, remove the -c flag.
All domains: 65594 unique matching objects of stype: all, withreps, uses 18.21GB disk space

This gives me a good idea of what’s in my cluster.
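On a cluster with many more domains, it may help to sort this listing by object count. The sketch below assumes the per-domain line format shown above ("<domain>/ has <N> unique ..."); rank_domains is a hypothetical helper name, and the sample lines are copied from the output above.

```shell
# Hypothetical helper: print "<count> <domain>" sorted by object count,
# largest first. Lines that are not per-domain counts are ignored.
rank_domains() {
    awk '$2 == "has" && $4 == "unique" { sub(/\/$/, "", $1); print $3, $1 }' |
        sort -rn
}

rank_domains <<'EOF'
caringodrive.c-csn1.enfield.com/ has 20 unique matching objects of stype: all, withreps, uses 156.55MB disk space
es-backups.enfield.com/ has 41360 unique matching objects of stype: all, withreps, uses 3.69GB disk space
c-csn1-cfs1.enfield.com/ has 9853 unique matching objects of stype: all, withreps, uses 259.00MB disk space
All domains: 65594 unique matching objects of stype: all, withreps, uses 18.21GB disk space
EOF
```

The same filter could be fed from the script directly, for example indexer-enumerator.sh -d ALL -c | rank_domains.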

Counting untenanted objects

What it does not show me are the untenanted objects (those not in any domain). Older clusters may not have any domains and so all of the objects would be untenanted. Newer clusters will have most or all objects tenanted and use enforceTenancy=true in the cluster configuration to ensure that all objects are in a domain.

We can see if we have any untenanted objects by using the -t option. I will again use the -c option just to get a count of the number of objects.

[root@c-csn1 tmp]# indexer-enumerator.sh -t -c

Only streams counts are listed.  To get the streams themselves, remove the -c flag.

Untenanted streams enumerated: 9 unique objects, withreps, uses 101.44KB disk space

By this, I learn that I have only 9 untenanted objects in this particular cluster.

Counting buckets

Going back to the all domains output, the c-csn1-test1.enfield.com domain looks interesting to me because its name doesn’t give me a good idea of what’s in it (unlike filefly-c-csn1.enfield.com and es-backups.enfield.com, whose names are self-explanatory).

So, let’s drill down into that domain by using the -d c-csn1-test1.enfield.com option.

How many buckets live in here?

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -B -c

Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/ has 20 unique objects of stype: bucket, withreps, uses 0 disk space.

There appear to be 20 buckets here, and they seem to use no disk space. That’s because I asked for only bucket objects, which themselves hold no data. To see how much data resides inside a particular bucket, I would need to do a query on that bucket. Also, there might be unnamed objects that live in this domain (that is, objects named by UUID that do not live in a bucket).

Let’s see what buckets exist in this domain (not just count them, as we did above):

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -B

Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com

c-csn1-test1.enfield.com/.TOKEN
c-csn1-test1.enfield.com/Bucket15917374547579_0
c-csn1-test1.enfield.com/Bucket15917374547579_1
c-csn1-test1.enfield.com/Bucket15917374547579_2
c-csn1-test1.enfield.com/Bucket15917374547579_3
c-csn1-test1.enfield.com/Bucket15917374547579_4
c-csn1-test1.enfield.com/Bucket15917374547579_5
c-csn1-test1.enfield.com/Bucket15917374547579_6
c-csn1-test1.enfield.com/Bucket15917374547579_7
c-csn1-test1.enfield.com/Bucket15917374547579_8
c-csn1-test1.enfield.com/Bucket15917374547579_9
c-csn1-test1.enfield.com/Bucket15917383799242_0
c-csn1-test1.enfield.com/Bucket15917383799242_1
c-csn1-test1.enfield.com/Bucket15917383799242_2
c-csn1-test1.enfield.com/Bucket15917383799242_3
c-csn1-test1.enfield.com/Bucket15917383799242_4
c-csn1-test1.enfield.com/pants
c-csn1-test1.enfield.com/10kbuckettest
c-csn1-test1.enfield.com/superpants
c-csn1-test1.enfield.com/20200622

c-csn1-test1.enfield.com/ has 20 unique objects of stype: bucket, withreps, uses 0 disk space.
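To survey every bucket rather than one at a time, one option is to turn a bucket listing like the one above into one count command per bucket. This is only a sketch: bucket_count_cmds is a hypothetical helper, it assumes the "<domain>/<bucket>" line format shown above, and it prints the commands instead of running them so they can be reviewed first.

```shell
# Hypothetical helper: read a bucket listing on stdin and print one
# "count the objects in this bucket" command per bucket.
bucket_count_cmds() {
    domain="$1"
    while IFS= read -r line; do
        case "$line" in
            "$domain"/*) bucket="${line#"$domain"/}" ;;
            *) continue ;;                 # banner and blank lines
        esac
        case "$bucket" in
            ""|*" "*) continue ;;          # the "<domain>/ has N ..." summary
        esac
        echo "indexer-enumerator.sh -ro -d $domain -b $bucket -c"
    done
}

bucket_count_cmds c-csn1-test1.enfield.com <<'EOF'
Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com

c-csn1-test1.enfield.com/pants
c-csn1-test1.enfield.com/superpants

c-csn1-test1.enfield.com/ has 20 unique objects of stype: bucket, withreps, uses 0 disk space.
EOF
```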

Searching objects

I see that I have a bucket named “pants”. Let’s see how many objects live in my pants bucket.

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -b pants -c

Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/pants has 3 unique objects of stype: all, withreps, uses 11.83KB disk space.

As there are only three, I will output them to stdout (keeping the -or flags and removing the -c flag):

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -b pants

Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com

c-csn1-test1.enfield.com/pants/vimium-options.json
c-csn1-test1.enfield.com/pants/vimium-options-2020-mbp.json
c-csn1-test1.enfield.com/pants/plugins.txt

c-csn1-test1.enfield.com/pants has 3 unique objects of stype: all, withreps, uses 11.83KB disk space. 

I keep using the -c option because I could potentially make a query that returns millions if not billions of results. Certainly I don’t want to do that right now. From the above, I see that I have 3 files in that bucket.

Because that domain should be an FQDN that resolves to the Content Gateway (or Swarm cluster, if not using Gateway), I can send a curl INFO (HEAD) request for any of these files to see more information:

[root@c-csn1 tmp]# curl -IL c-csn1-test1.enfield.com/pants/plugins.txt -ucaringoadmin:caringo
HTTP/1.1 200 OK
Date: Wed, 22 Jul 2020 18:28:05 GMT
Gateway-Request-Id: ED6BE75CEE440295
Server: CAStor Cluster/11.2.0
Via: 1.1 c-csn1-test1.enfield.com (Cloud Gateway SCSP/6.4.0)
Gateway-Protocol: scsp
Castor-System-CID: 15a648db93dc29a6819bb256643915fc
Castor-System-Cluster: c-csn1.enfield.com
Castor-System-Created: Fri, 19 Jun 2020 21:31:53 GMT
Castor-System-Name: plugins.txt
Castor-System-Version: 1592602313.352
Content-Type: application/x-www-form-urlencoded
Last-Modified: Fri, 19 Jun 2020 21:31:53 GMT
X-Last-Modified-By-Meta: acepelon@
X-Owner-Meta: acepelon
ETag: "f877345eb91e9b72ad44d2a4480af33c"
Castor-System-Path: /c-csn1-test1.enfield.com/pants/plugins.txt
Castor-System-Domain: c-csn1-test1.enfield.com
Volume: 53a22d293eea60eb4bfaacc9933f12d6
Content-MD5: Li8xabfpx+wi+MMZFE3Uqg==
Content-Length: 616

I can see that I wrote this object recently, and there aren’t many objects in this bucket. I am going to poke around more.
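When checking many objects, reading every header block by eye gets tedious. As a sketch, a filter like the hypothetical header_value helper below pulls a single header out of curl -I output. It matches header names case-insensitively and strips the trailing carriage return that HTTP header lines carry. The sample response is abbreviated from the one above.

```shell
# Hypothetical helper: print the value of one header from "curl -I" output.
# $1 = header name (case-insensitive); the response is read from stdin.
header_value() {
    awk -v name="$1" '
        BEGIN { name = tolower(name) ":" }
        {
            sub(/\r$/, "")      # header lines end in CRLF
            if (tolower(substr($0, 1, length(name))) == name) {
                value = substr($0, length(name) + 1)
                sub(/^[ \t]+/, "", value)
                print value
            }
        }'
}

printf '%s\r\n' \
    'HTTP/1.1 200 OK' \
    'Castor-System-Created: Fri, 19 Jun 2020 21:31:53 GMT' \
    'Content-Length: 616' | header_value castor-system-created
```

Against a live object this could be used as curl -sIL [object-url] -u[user]:[password] | header_value Castor-System-Created.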

Searching unnamed objects

So, if that bucket doesn’t contain a majority of the objects in my domain, what bucket does? Or perhaps unnamed objects are the majority of my objects. We can search for only unnamed objects by using the -u option:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -u -c

Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/ has 79 unique objects of stype: unnamed, withreps, uses 80.33KB disk space.

We might be tempted to use the -t option for “untenanted” objects, because untenanted objects are always unnamed. But these objects ARE tenanted (they live in a domain) and are merely unnamed, so combining -d [domain] with -t will produce an error.
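The distinction can be captured in a small wrapper. This is only a sketch with a hypothetical function name; it prints the count command used earlier on this page for each case (-u together with a domain, -t on its own) and never combines -d with -t.

```shell
# Hypothetical helper: print the count command for objects without a name.
# With a domain: unnamed objects inside that domain (-u).
# Without a domain: untenanted objects (-t).
unnamed_count_cmd() {
    domain="${1:-}"
    if [ -n "$domain" ]; then
        echo "indexer-enumerator.sh -ro -d $domain -u -c"
    else
        echo "indexer-enumerator.sh -t -c"
    fi
}

unnamed_count_cmd c-csn1-test1.enfield.com
unnamed_count_cmd
```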

Ok, we have 79 unnamed objects that live in c-csn1-test1.enfield.com. I want to get a few examples of these to show you what unnamed objects in a domain look like, but I don’t want to output all 79 to stdout. I will use the -u -1 -M 5 options to say “send only a single request for results (-1), return only 5 items in that single request (-M 5), and return only unnamed objects (-u)”:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -u -1 -M 5

Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com

c-csn1-test1.enfield.com/7f7c9ecde7f265ac7dd4ba81e4388540
c-csn1-test1.enfield.com/f5b214774783d8bcc91ceae67c50a080
c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054
c-csn1-test1.enfield.com/6fe0dbcdbf8bc538250f655dd152b5fd
c-csn1-test1.enfield.com/3122faaa7d02f9f7438702bf6bedb6ff


c-csn1-test1.enfield.com/ has 79 unique objects of stype: unnamed, withreps, uses 80.33KB disk space.

I can now curl any of these objects if I wanted to see their headers:

[root@c-csn1 tmp]# curl -IL c-csn1-test1.enfield.com/3122faaa7d02f9f7438702bf6bedb6ff -ucaringoadmin:caringo
HTTP/1.1 200 OK
Date: Wed, 22 Jul 2020 18:42:48 GMT
Gateway-Request-Id: FE7629C67527A767
Server: CAStor Cluster/11.2.0
Via: 1.1 c-csn1-test1.enfield.com (Cloud Gateway SCSP/6.4.0)
Gateway-Protocol: scsp
Castor-System-CID: 21876415934a554d1072804cfc776e10
Castor-System-Cluster: c-csn1.enfield.com
Castor-System-Created: Tue, 23 Jun 2020 15:07:45 GMT
Content-Type: application/x-www-form-urlencoded
Last-Modified: Tue, 23 Jun 2020 15:07:45 GMT
X-Last-Modified-By-Meta:
X-Owner-Meta:
x-bob-meta-apples: dunkin
ETag: "3122faaa7d02f9f7438702bf6bedb6ff"
Castor-System-Domain: c-csn1-test1.enfield.com
Volume: fa52b18e98d6164c5c0b700bba9652bb
Content-MD5: Ep4TEA3HwH8cOehCM1zZIQ==
Content-Length: 412

Searching metadata

Notice I have a metadata header called “x-bob-meta-apples” with value of “dunkin”. That’s interesting to me. I wonder if I have that metadata elsewhere in this domain:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -c

Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/ has 43 unique objects of stype: all, withreps, uses 3.42MB disk space.

The -m and -v options together show me that indeed I do have 43 matching objects. I wonder if I have any other objects that match the header but not necessarily that value. For this test, I simply remove the -v dunkin part of the command:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples  -c

Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/ has 78 unique objects of stype: all, withreps, uses 3.46MB disk space.

Since I have more results here, I know that I have that header with a different value.

Searching across multiple domains

One of the more powerful things about the indexer-enumerator.sh is that I can search across all domains, not just one domain. Let’s see how many objects matching that metadata header I have across my whole cluster. For this query, I change the domain name to “ALL” and I am just going to get a count match by using the -c option again:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d ALL -m x-bob-meta-apples -c

Enumerating all domains in the cluster:

Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com

test1.c-csn1.enfield.com/ has 7 unique matching objects of stype: all, withreps, uses 7.32KB disk space
caringodrive.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-c-csn1.enfield.com/ has 42 unique matching objects of stype: all, withreps, uses 41.67KB disk space
c-csn1-test1.enfield.com/ has 78 unique matching objects of stype: all, withreps, uses 3.46MB disk space
c-csn1-admindomain/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
m-csn4.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
nfstest1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-s3-target.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
es-backups.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
bob.is.great.com/ has 6 unique matching objects of stype: all, withreps, uses 10.81MB disk space
s3-compatible/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-cfs1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-s3-target.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space


Only streams counts are listed.  To get the streams themselves, remove the -c flag.
All domains: 133 unique matching objects of stype: all, withreps, uses 14.32MB disk space
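Most of the lines in a -d ALL listing report zero matches. As a sketch, a filter like the hypothetical domains_with_matches helper below keeps only the domains with at least one match, assuming the per-domain line format shown above. The sample lines are copied from that output.

```shell
# Hypothetical helper: print only the domains that have at least one match.
domains_with_matches() {
    awk '$2 == "has" && $4 == "unique" && $3 + 0 > 0 { sub(/\/$/, "", $1); print $1 }'
}

domains_with_matches <<'EOF'
test1.c-csn1.enfield.com/ has 7 unique matching objects of stype: all, withreps, uses 7.32KB disk space
caringodrive.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-c-csn1.enfield.com/ has 42 unique matching objects of stype: all, withreps, uses 41.67KB disk space
c-csn1-test1.enfield.com/ has 78 unique matching objects of stype: all, withreps, uses 3.46MB disk space
bob.is.great.com/ has 6 unique matching objects of stype: all, withreps, uses 10.81MB disk space
All domains: 133 unique matching objects of stype: all, withreps, uses 14.32MB disk space
EOF
```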

That shows me 4 different domains (although it doesn’t show me untenanted objects that may match) have objects with that metadata. I can then narrow the search down to match that particular header value “dunkin”:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d ALL -m x-bob-meta-apples -v dunkin -c

Enumerating all domains in the cluster:

Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com

test1.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
caringodrive.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-c-csn1.enfield.com/ has 14 unique matching objects of stype: all, withreps, uses 14.64KB disk space
c-csn1-test1.enfield.com/ has 43 unique matching objects of stype: all, withreps, uses 3.42MB disk space
c-csn1-admindomain/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
m-csn4.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
nfstest1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-s3-target.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
es-backups.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
bob.is.great.com/ has 3 unique matching objects of stype: all, withreps, uses 5.40MB disk space
s3-compatible/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-cfs1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-s3-target.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space


Only streams counts are listed.  To get the streams themselves, remove the -c flag.
All domains: 60 unique matching objects of stype: all, withreps, uses 8.84MB disk space

73 fewer objects. Let’s try a different header value:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d ALL -m x-bob-meta-apples -v donuts -c

Enumerating all domains in the cluster:

Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com

test1.c-csn1.enfield.com/ has 7 unique matching objects of stype: all, withreps, uses 7.32KB disk space
caringodrive.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-c-csn1.enfield.com/ has 28 unique matching objects of stype: all, withreps, uses 27.02KB disk space
c-csn1-test1.enfield.com/ has 35 unique matching objects of stype: all, withreps, uses 36.62KB disk space
c-csn1-admindomain/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
m-csn4.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
nfstest1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-s3-target.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
es-backups.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
bob.is.great.com/ has 3 unique matching objects of stype: all, withreps, uses 5.40MB disk space
s3-compatible/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-cfs1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-s3-target.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space


Only streams counts are listed.  To get the streams themselves, remove the -c flag.
All domains: 73 unique matching objects of stype: all, withreps, uses 5.47MB disk space

Ah! This shows me that all of the objects matching that header have a value of either “dunkin” or “donuts”.
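That conclusion rests on simple arithmetic: 60 + 73 = 133. A sketch of the same check in shell follows, using a hypothetical total_count helper that reads the "All domains:" summary line. It assumes each object carries a single value for the header, so that no object is counted under both values.

```shell
# Hypothetical helper: extract the object count from the "All domains:" line.
total_count() {
    awk '$1 == "All" && $2 == "domains:" { print $3 }'
}

# Summary lines copied from the three runs above.
all=$(echo 'All domains: 133 unique matching objects of stype: all, withreps, uses 14.32MB disk space' | total_count)
dunkin=$(echo 'All domains: 60 unique matching objects of stype: all, withreps, uses 8.84MB disk space' | total_count)
donuts=$(echo 'All domains: 73 unique matching objects of stype: all, withreps, uses 5.47MB disk space' | total_count)

if [ $((dunkin + donuts)) -eq "$all" ]; then
    echo "dunkin + donuts accounts for all $all objects"
else
    echo "$((all - dunkin - donuts)) objects have some other value"
fi
```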

Searching by age

What if I was only interested in objects written long ago? Maybe I want to find all objects written x days ago so that I can delete them…

Let’s get a single object from the matching output above and then do a curl INFO.

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -1 -M 1

Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com

c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054


c-csn1-test1.enfield.com/ has 43 unique objects of stype: all, withreps, uses 3.42MB disk space.
[root@c-csn1 tmp]# curl -IL c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054 -ucaringoadmin:caringo
HTTP/1.1 200 OK
Date: Wed, 22 Jul 2020 18:57:27 GMT
Gateway-Request-Id: 58E8B3631B9490E0
Server: CAStor Cluster/11.2.0
Via: 1.1 c-csn1-test1.enfield.com (Cloud Gateway SCSP/6.4.0)
Gateway-Protocol: scsp
Castor-System-CID: 21876415934a554d1072804cfc776e10
Castor-System-Cluster: c-csn1.enfield.com
Castor-System-Created: Tue, 23 Jun 2020 15:07:45 GMT
Content-Type: application/x-www-form-urlencoded
Last-Modified: Tue, 23 Jun 2020 15:07:45 GMT
X-Last-Modified-By-Meta:
X-Owner-Meta:
x-bob-meta-apples: dunkin
ETag: "e0896cec233e382c17840ae1c7d92054"
Castor-System-Domain: c-csn1-test1.enfield.com
Volume: fa52b18e98d6164c5c0b700bba9652bb
Content-MD5: 6AspDUv0/7hEBsMFALI5Ig==
Content-Length: 858

I can see that it was written on June 23 of this year. Were ALL of the objects matching that header written within the last year? We can check by using the -G 1 option (written within the last year) and the -g 1 option (written at least a year ago):

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -G 1 -c

Only enumerating streams written since 1 year(s) ago
Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/ has 43 unique objects of stype: all, withreps, uses 3.42MB disk space.

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -g 1 -c

Only enumerating streams written at least 1 year(s) ago
Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/ has 0 unique objects of stype: all, withreps, uses 0 disk space.

Yes, they were all written within the last year. Since the example object was written on June 23 and today is July 22, it is 29 days old. I can narrow the search further with the -f option, which matches objects written at least that many days ago:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 29 -c

Only enumerating streams written at least 29 day(s) ago
Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/ has 14 unique objects of stype: all, withreps, uses 14.64KB disk space.

14 objects matched, and our test object is among them, as expected:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 29 | grep e08
c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054 

But only 14 of the 43 objects were written at least 29 days ago. Were any written at least 30 days ago?

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 30 -c

Only enumerating streams written at least 30 day(s) ago
Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/ has 0 unique objects of stype: all, withreps, uses 0 disk space.

Nope.
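The 29-day figure used above can be computed rather than counted on a calendar. This sketch assumes GNU date (the -d option behaves differently on other platforms), and days_between is a hypothetical helper name.

```shell
# Hypothetical helper: whole days between two dates (YYYY-MM-DD).
# Assumes GNU date; -u avoids daylight-saving offsets in the subtraction.
days_between() {
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    echo $(( (end - start) / 86400 ))
}

days_between 2020-06-23 2020-07-22   # prints 29
```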

I can further winnow my results as desired.

Let’s back out a little bit and add another option.

Search by size

What if we were looking for small files with that same metadata? Let’s try to match objects about the same size as our example above, which was 858 bytes. To that end, I will add the -l 859 (l for “littler”) option to our query.

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 29 -l 859 -c

Only streams smaller than 859 bytes are listed.
Only enumerating streams written at least 29 day(s) ago
Only streams counts are listed.  To get the streams themselves, remove the -c flag.

c-csn1-test1.enfield.com/ has 12 unique objects of stype: all, withreps, uses 10.11KB disk space.

Ok, 12 of the 14 objects were smaller than 859 bytes. Nice to know. I want to verify that my example object is in that result set as a sanity check. Its UUID starts with “e”, so let’s use yet another option: the prefix match. I will add -p e to match any object whose name or UUID starts with “e”. I will remove the -c option so that I actually see the match:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d c-csn1-test1.enfield.com -m x-bob-meta-apples -v dunkin -f 29 -l 859 -p e

Starting to enumerate the requested streams in domain: c-csn1-test1.enfield.com

c-csn1-test1.enfield.com/e0896cec233e382c17840ae1c7d92054

Only streams smaller than 859 bytes are listed.
Only streams with names (or UUIDs) starting with "e" are listed.
Only enumerating streams written at least 29 day(s) ago

c-csn1-test1.enfield.com/ has 1 unique objects of stype: all, withreps, uses 1.67KB disk space.
[root@c-csn1 tmp]#

Sure enough, there’s our object!

Now, what if I want to match all objects larger than that object across all domains, with that same header and value, written at least 29 days ago? I will use the capital -L option and change the domain to “ALL”:

[root@c-csn1 tmp]# indexer-enumerator.sh -ro -d ALL -m x-bob-meta-apples -v dunkin -f 29 -L 859 -c

Enumerating all domains in the cluster:

Here are the domains:
test1.c-csn1.enfield.com
caringodrive.c-csn1.enfield.com
filefly-c-csn1.enfield.com
c-csn1-test1.enfield.com
c-csn1-admindomain
m-csn4.enfield.com
nfstest1.enfield.com
filefly-s3-target.c-csn1.enfield.com
es-backups.enfield.com
c-csn1.enfield.com
bob.is.great.com
s3-compatible
c-csn1-cfs1.enfield.com
c-csn1-s3-target.enfield.com

test1.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
caringodrive.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-test1.enfield.com/ has 2 unique matching objects of stype: all, withreps, uses 4.53KB disk space
c-csn1-admindomain/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
m-csn4.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
nfstest1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
filefly-s3-target.c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
es-backups.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
bob.is.great.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
s3-compatible/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-cfs1.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space
c-csn1-s3-target.enfield.com/ has 0 unique matching objects of stype: all, withreps, uses 0 disk space


Only streams larger than 859 bytes are listed.
Only enumerating streams written at least 29 day(s) ago
Only streams counts are listed.  To get the streams themselves, remove the -c flag.
All domains: 2 unique matching objects of stype: all, withreps, uses 4.53KB disk space

I can see only 2 objects match. I can remove the -c option and get those results if I wanted.

Hopefully the above gives you a good understanding of how the indexer-enumerator.sh script works and the power of its flexibility.
