
The open source command-line tool "rclone" is a fast and stable utility for listing and copying files between storage systems and local file systems. It is also cross-platform, with binaries available for Linux, OS X, and Microsoft Windows.

http://rclone.org/
http://linoxide.com/file-system/configure-rclone-linux-sync-cloud/

Download rclone for your platform from http://rclone.org/downloads/, unzip it, and put the binary somewhere in your PATH.

You can skip the interactive "rclone config" step by filling in this template:

[caringo]
type=s3
access_key_id=${S3_ACCESS_KEY}
secret_access_key=${S3_SECRET_KEY}
# With rclone > 1.47 the endpoint must NOT include the default port (443 or 80), or you will get signature errors!
endpoint=${S3_PROTOCOL}://${DOMAIN}:${S3_PORT}
location_constraint=
# The default 5MB part size is inefficient
chunk_size=100M
# This --s3-no-check-bucket option breaks mkdir but is required with gateway < 7.1 and rclone > v1.50 to avoid 409 errors (CLOUD-3213).
# no_check_bucket=true
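
The ${...} placeholders in the template can be filled in by the shell. This is a minimal sketch, assuming the S3_* variables shown are exported (the key, domain, and port values here are only examples):

```shell
# Expand the template above into an rclone config file.
# The exported values are examples only; substitute your own token,
# domain, and port. The unquoted heredoc expands ${...} in place.
export S3_ACCESS_KEY=c63d5b1034c6b41b119683a5e264abd0
export S3_SECRET_KEY=secret
export S3_PROTOCOL=https DOMAIN=mydomain.cloud.caringo.com S3_PORT=8090
conf=$(mktemp)
cat > "$conf" <<EOF
[caringo]
type = s3
access_key_id = ${S3_ACCESS_KEY}
secret_access_key = ${S3_SECRET_KEY}
endpoint = ${S3_PROTOCOL}://${DOMAIN}:${S3_PORT}
chunk_size = 100M
EOF
grep endpoint "$conf"   # endpoint = https://mydomain.cloud.caringo.com:8090
```

Move the generated file to ~/.config/rclone/rclone.conf, or point rclone at it with --config.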

For example, if your S3 domain / endpoint is "https://mydomain.cloud.caringo.com" you can create a token with:

$ curl -i -u johndoe -X POST --data-binary '' -H 'X-User-Secret-Key-Meta: secret' \
  -H 'X-User-Token-Expires-Meta: +90' https://mydomain.cloud.caringo.com/.TOKEN/
HTTP/1.1 201 Created
...
Token c63d5b1034c6b41b119683a5e264abd0 issued for johndoe in [root] with secret secret
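
The token value in that 201 response becomes the access_key_id below. A small sketch (using the sample response text above) pulls it out with awk:

```shell
# Extract the token ID from the curl response shown above.
# A sketch: it simply grabs the word after "Token" on the summary line.
response='Token c63d5b1034c6b41b119683a5e264abd0 issued for johndoe in [root] with secret secret'
token=$(printf '%s\n' "$response" | awk '/^Token /{print $2}')
echo "$token"   # c63d5b1034c6b41b119683a5e264abd0
```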


Then add this entry to ~/.rclone.conf (or, with newer rclone, ~/.config/rclone/rclone.conf):

[caringo]
type = s3
# Do NOT use V2 sigs, have seen signature problems.
# region = other-v2-signature
access_key_id = c63d5b1034c6b41b119683a5e264abd0
secret_access_key = secret
endpoint = https://mydomain.cloud.caringo.com
location_constraint =
# The no_check_bucket option (--s3-no-check-bucket on the command line) is required with rclone > v1.50 and Gateway < 7.1 to avoid 409 errors (CLOUD-3213)!
no_check_bucket = true



If you prefer a GUI client to manage your copy and sync jobs, try Rclone Browser (https://martins.ninja/RcloneBrowser/). Just download the binary from https://github.com/mmozeiko/RcloneBrowser/releases and point it at your rclone.conf. It is very flexible; you can configure many of the options below.

Here are some example commands:

  • List the buckets in your domain
    $ rclone lsd caringo:
              -1 2015-03-16 20:13:52        -1 public
              -1 2015-11-28 23:10:32        -1 inbox
    Transferred:            0 Bytes (   0.00 kByte/s)
    Errors:                 0
    Checks:                 0
    Transferred:            0
    Elapsed time:  5.653212245s
  • Copy your Pictures directory (recursively) to an "old-pics" bucket. It will be created if it does not exist.
    $ rclone copy --s3-upload-concurrency 10 --s3-chunk-size 100M '/Volumes/Backup/Pictures/' caringo:old-pics
    2016/01/12 13:55:47 S3 bucket old-pics: Building file list
    2016/01/12 13:55:48 S3 bucket old-pics: Waiting for checks to finish
    2016/01/12 13:55:48 S3 bucket old-pics: Waiting for transfers to finish
    2016/01/12 13:56:45 
    Transferred:      2234563 Bytes (  36.36 kByte/s)
    Errors:                 0
    Checks:                 0
    Transferred:            1
    Elapsed time:  1m0.015171105s
    Transferring:  histomapwider.jpg
    ...
  • List the files in the bucket
    $ rclone ls caringo:old-pics
        6148 .DS_Store
     4032165 histomapwider.jpg
    ...
  • Quickly see the size of the objects in a bucket:
    $ rclone size caringo:old-pics
    Total objects: 173
    Total size: 9.550 GBytes (10254108727 Bytes)
  • Verify all files were uploaded (note the trailing slash is necessary on the local directory!). The check command can also compare two buckets.
    $ rclone check ~/Pictures/test/ caringo:old-pics
    2016/01/12 14:01:18 S3 bucket old-pics: Building file list
    2016/01/12 14:01:18 S3 bucket old-pics: 1 files not in Local file system at /Users/.../Pictures/test
    2016/01/12 14:01:18 .DS_Store: File not in Local file system at /Users/.../Pictures/test
    2016/01/12 14:01:18 Local file system at /Users/..../Pictures/test: 0 files not in S3 bucket old-pics
    2016/01/12 14:01:18 S3 bucket old-pics: Waiting for checks to finish
    2016/01/12 14:01:18 S3 bucket old-pics: 1 differences found
    2016/01/12 14:01:18 Failed to check: 1 differences found

    Note that "check" appears to be confused by the Mac OS X hidden file ".DS_Store".
  • Tips: use "-v" and "--dump headers" or "--dump bodies" to see verbose details. 
  • To ignore system files you don't want compared or uploaded use something like:
       --exclude '.DS_Store' --exclude '.Trashes**' --exclude '.fseventsd**' --exclude '.Spotlight**' --exclude '._*'
  • Increase the part size with --s3-chunk-size 100M (defaults to 5M) to improve the speed and storage efficiency of resulting large streams. 
  • Speed up large transfers with "--transfers=10" and "--s3-upload-concurrency 4".
  • You might want to use --s3-disable-checksum when uploading huge files.
  • Unfortunately rclone does not copy or let you add metadata, though there are some enhancement requests on github.
  • See if using "rclone ls --fast-list caringo:mybucket" speeds up your large bucket listings. Note that --fast-list does not use "delimiter" listings; starting with Gateway 7.6, ?delimiter=/ listings are much faster, so compare both.
  • Copy a file from a plain http website into Swarm by streaming it directly. First define an http remote in your rclone.conf:
      [commondatastorage]
      type = http
      url = https://commondatastorage.googleapis.com
    then run:
    # rclone -v --dump headers copy commondatastorage:gtv-videos-bucket/sample/ElephantsDream.mp4 caringo:mybucket/sample-videos/
  • Configure rclone using four environment variables instead of a config file:
    $ export RCLONE_S3_ENDPOINT=https://support.cloud.caringo.com
    # handy one-liner to create an S3 token
    $ curl -fsS -u USERNAME -XPOST -H "X-User-Secret-Key-Meta: secret" ${RCLONE_S3_ENDPOINT}/_admin/manage/tenants/caringo/tokens | jq -r ".token,.secret" | { read RCLONE_S3_ACCESS_KEY_ID && read RCLONE_S3_SECRET_ACCESS_KEY && echo "export RCLONE_S3_ACCESS_KEY_ID=${RCLONE_S3_ACCESS_KEY_ID} RCLONE_S3_SECRET_ACCESS_KEY=${RCLONE_S3_SECRET_ACCESS_KEY} RCLONE_CONFIG_MYS3_TYPE=s3" ; } 
    Enter host password for user 'USERNAME':
    export RCLONE_S3_ACCESS_KEY_ID=0d4506108b8aa15f784d6ada317abb90 RCLONE_S3_SECRET_ACCESS_KEY=secret RCLONE_CONFIG_MYS3_TYPE=s3

    # Copy and paste that output setting the remaining three env variables into your shell
    $ export RCLONE_S3_ACCESS_KEY_ID=0d4506108b8aa15f784d6ada317abb90 RCLONE_S3_SECRET_ACCESS_KEY=secret RCLONE_CONFIG_MYS3_TYPE=s3
    # Now you can e.g. move all numbered directories 1-100 into Swarm
    $ seq 1 100 | sed 's#$#/**#g' > /tmp/xx
    $ rclone move -v --include-from /tmp/xx --delete-empty-src-dirs . MYS3:archive/old-builds/
    $ seq 1 100 | xargs -n 1 rmdir  # rclone only deletes the contents
  • Mount a bucket as a folder in your file system. If you do not use --use-server-modtime, rclone will HEAD every object in the bucket, which is very slow.
    $ mkdir /tmp/tickets
    $ rclone mount --read-only --use-server-modtime support:tickets /tmp/tickets &
    $ ls -lSh /tmp/tickets
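
The --exclude flags shown earlier can also be collected into a filter file so every command shares one list. A sketch, assuming the filename macos-excludes.txt (any name works):

```shell
# Keep the macOS junk-file patterns in one place instead of repeating
# --exclude flags on every command. The quoted heredoc writes the
# patterns literally, one per line, as rclone's --exclude-from expects.
cat > macos-excludes.txt <<'EOF'
.DS_Store
.Trashes**
.fseventsd**
.Spotlight**
._*
EOF
# Then reference the file from any copy/sync/check command, e.g.:
# rclone copy --exclude-from macos-excludes.txt ~/Pictures/ caringo:old-pics
```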