rclone is a fast and stable open source command-line utility for listing and copying files between storage systems and local file systems. It is cross-platform, available for Linux, macOS, and Microsoft Windows.
http://rclone.org/
http://linoxide.com/file-system/configure-rclone-linux-sync-cloud/
Info |
---|
rclone v1.55.1 has known issues specific to object versioning and v1.57 (the default package in EPEL) has checksum issues, so please make sure to install a current rclone release |
Install and configuration:
Download rclone for your platform from http://rclone.org/downloads/, unzip it, and put the binary in your PATH. OS packages and containers are also available. For CentOS/RHEL, just run "yum install epel-release" and "yum install rclone", but then you must run "rclone selfupdate" to avoid the issues in the older v1.57 release.
You can skip "rclone config" by using this template.
[caringodatacore]
type=s3
provider=Other
access_key_id=${S3_ACCESS_KEY}
secret_access_key=${S3_SECRET_KEY}
# Must NOT include default port 443 or 80 with rclone >1.47 to avoid signature errors!
endpoint=${S3_PROTOCOL}://${DOMAIN}:${S3_PORT}
location_constraint=
# The default 5MB part size is inefficient
chunk_size=100M
# This --s3-no-check-bucket option breaks mkdir but is required with gateway < 7.1 and rclone > v1.50 to avoid 409 errors (CLOUD-3213).
# no_check_bucket=true
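One way to fill in this template from a script is sketched below. It assumes the same S3_ACCESS_KEY, S3_SECRET_KEY, S3_PROTOCOL, DOMAIN, and S3_PORT variables that the template uses; build_endpoint and render_rclone_conf are hypothetical helper names, not rclone commands. The sketch drops the port when it is the protocol's default, which is one reading of the port warning in the template, and it leaves out the optional no_check_bucket line.

```shell
# Sketch only: build_endpoint and render_rclone_conf are hypothetical helpers.

# Print PROTOCOL://DOMAIN, appending :PORT only when a port is given and it is
# not the protocol's default (443 for https, 80 for http).
# $1 = protocol, $2 = domain, $3 = port (optional)
build_endpoint() {
  case "$1:${3:-}" in
    https:443|http:80|*:) printf '%s://%s\n' "$1" "$2" ;;
    *)                    printf '%s://%s:%s\n' "$1" "$2" "$3" ;;
  esac
}

# Write the filled-in [caringodatacore] section to stdout.
render_rclone_conf() {
  cat <<EOF
[caringodatacore]
type=s3
provider=Other
access_key_id=${S3_ACCESS_KEY}
secret_access_key=${S3_SECRET_KEY}
endpoint=$(build_endpoint "${S3_PROTOCOL}" "${DOMAIN}" "${S3_PORT:-}")
location_constraint=
chunk_size=100M
EOF
}

# Example use (review the output before appending it to your config file):
#   render_rclone_conf >> ~/.config/rclone/rclone.conf
```

Printing to stdout rather than writing the file directly makes it easy to inspect the rendered endpoint first.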
For example, if your S3 domain / endpoint is "https://mydomain.cloud.caringodatacore.com" you can create a token with:
$ curl -i -u johndoedcuser -X POST --data-binary '' -H 'X-User-Secret-Key-Meta: secret' \
-H 'X-User-Token-Expires-Meta: +90' https://mydomain.cloud.caringodatacore.com/.TOKEN/
HTTP/1.1 201 Created
...
Token c63d5b1034c6b41b119683a5e264abd0 issued for johndoedcuser in [root] with secret secret
...
Then add this entry to your ~/.rclone.conf file (or the newer location, ~/.config/rclone/rclone.conf):
[caringodatacore]
type = s3
# Do NOT use V2 sigs, have seen signature problems.
# region = other-v2-signature
access_key_id = c63d5b1034c6b41b119683a5e264abd0
secret_access_key = secret
endpoint = https://mydomain.cloud.caringodatacore.com
location_constraint=
...
# The --s3-no-check-bucket option is only required with rclone > v1.50 and gateway < 7.1 to avoid 409 errors (CLOUD-3213).
# no_check_bucket=true
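If you script the token step, the token ID can be pulled out of the curl response instead of being copied by hand. The sketch below assumes the response contains a line of the form "Token <id> issued for ..." as in the sample output above, and that the ID is lowercase hex; extract_token is a hypothetical helper name.

```shell
# Sketch only: extract_token is a hypothetical helper.
# Reads a token-creation response on stdin and prints the token ID taken from
# the line "Token <id> issued for ...". Prints nothing if no such line exists.
extract_token() {
  sed -n 's/^Token \([0-9a-f]\{1,\}\) issued for .*/\1/p'
}

# Example use (same curl command as above, piped into the helper):
#   curl -i -u johndoedcuser -X POST --data-binary '' \
#     -H 'X-User-Secret-Key-Meta: secret' -H 'X-User-Token-Expires-Meta: +90' \
#     https://mydomain.cloud.caringodatacore.com/.TOKEN/ | extract_token
```

The printed value is what goes into access_key_id; the secret is whatever you sent in the X-User-Secret-Key-Meta header.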
If you prefer a GUI client try the web-ui that rclone itself is able to serve: https://rclone.org/gui/
Here are some example rclone commands:
- List the buckets in your domain
$ rclone lsd caringodatacore:
-1 2015-03-16 20:13:52 -1 public
-1 2015-11-28 23:10:32 -1 inbox
Transferred: 0 Bytes ( 0.00 kByte/s)
Errors: 0
Checks: 0
Transferred: 0
Elapsed time: 5.653212245s
- Copy your Pictures directory (recursively) to an "old-pics" bucket. It will be created if it does not exist.
$ rclone copy --s3-upload-concurrency 10 --s3-chunk-size 100M '/Volumes/Backup/Pictures/' caringodatacore:old-pics
2016/01/12 13:55:47 S3 bucket old-pics: Building file list
2016/01/12 13:55:48 S3 bucket old-pics: Waiting for checks to finish
2016/01/12 13:55:48 S3 bucket old-pics: Waiting for transfers to finish
2016/01/12 13:56:45
Transferred: 2234563 Bytes ( 36.36 kByte/s)
Errors: 0
Checks: 0
Transferred: 1
Elapsed time: 1m0.015171105s
Transferring: histomapwider.jpg
...
- List the files in the bucket
$ rclone ls caringodatacore:old-pics
6148 .DS_Store
4032165 histomapwider.jpg
...
- Quickly see the size of the objects in a bucket:
$ rclone size caringodatacore:old-pics
Total objects: 173
Total size: 9.550 GBytes (10254108727 Bytes)
- Verify all files were uploaded (note that the trailing slash is necessary on the local directory!). The check command can also compare two buckets.
$ rclone check ~/Pictures/test/ caringodatacore:old-pics
2016/01/12 14:01:18 S3 bucket old-pics: Building file list
2016/01/12 14:01:18 S3 bucket old-pics: 1 files not in Local file system at /Users/jamshid.../Pictures/test
2016/01/12 14:01:18 .DS_Store: File not in Local file system at /Users/jamshid.../Pictures/test
2016/01/12 14:01:18 Local file system at /Users/jamshid..../Pictures/test: 0 files not in S3 bucket old-pics
2016/01/12 14:01:18 S3 bucket old-pics: Waiting for checks to finish
2016/01/12 14:01:18 S3 bucket old-pics: 1 differences found
2016/01/12 14:01:18 Failed to check: 1 differences found
Note that "check" appears to be confused by the macOS hidden file ".DS_Store".
- Tips: use "-v" and "--dump headers" or "--dump bodies" to see verbose details.
- To ignore system files you don't want compared or uploaded, use something like:
--exclude '.DS_Store' --exclude '.Trashes**' --exclude '.fseventsd**' --exclude '.Spotlight**' --exclude '._*'
- Increase the part size with --s3-chunk-size 100M (defaults to 5M) to improve the speed and storage efficiency of resulting large streams.
- Speed up large transfers with "--transfers=10" and "--s3-upload-concurrency 10".
- You might want to use --s3-disable-checksum when uploading huge files.
- Unfortunately rclone does not copy or let you add metadata, though there are some enhancement requests on GitHub.
- See if using "rclone ls --fast-list datacore:mybucket" speeds up your large bucket listings. This does not use "delimiter" listings; starting with Gateway 7.6, such non-delimiter listings are much faster than ?delimiter=/ listings.
- Copy a file from a plain http website into Swarm by streaming it directly:
# rclone -v --dump headers copy commondatastorage:gtv-videos-bucket/sample/ElephantsDream.mp4 datacore:mybucket/sample-videos/
[commondatastorage]
type = http
url = https://commondatastorage.googleapis.com
- Configure rclone using four environment variables instead of a config file:
$ export RCLONE_S3_ENDPOINT=https://support.cloud.datacore.com
# handy one-liner to create an S3 token
$ curl -fsS -u USERNAME -XPOST -H "X-User-Secret-Key-Meta: secret" ${RCLONE_S3_ENDPOINT}/_admin/manage/tenants/datacore/tokens | jq -r ".token,.secret" | { read RCLONE_S3_ACCESS_KEY_ID && read RCLONE_S3_SECRET_ACCESS_KEY && echo "export RCLONE_S3_ACCESS_KEY_ID=${RCLONE_S3_ACCESS_KEY_ID} RCLONE_S3_SECRET_ACCESS_KEY=${RCLONE_S3_SECRET_ACCESS_KEY} RCLONE_CONFIG_MYS3_TYPE=s3" ; }
Enter host password for user 'USERNAME':
export RCLONE_S3_ACCESS_KEY_ID=0d4506108b8aa15f784d6ada317abb90 RCLONE_S3_SECRET_ACCESS_KEY=secret RCLONE_CONFIG_MYS3_TYPE=s3
# Copy and paste that output, which sets the remaining three env variables, into your shell
$ export RCLONE_S3_ACCESS_KEY_ID=0d4506108b8aa15f784d6ada317abb90 RCLONE_S3_SECRET_ACCESS_KEY=secret RCLONE_CONFIG_MYS3_TYPE=s3
# Now you can e.g. move all numbered directories 1-100 into Swarm
$ seq 1 100 | sed 's#$#/**#g' > /tmp/xx
$ rclone move -v --include-from /tmp/xx --delete-empty-src-dirs . MYS3:archive/old-builds/
$ seq 1 100 | xargs -n 1 rmdir # rclone only deletes the contents
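To see what the include file in this example contains before moving anything, the pattern generation can be wrapped in a small function. This is a sketch only; make_include_patterns is a hypothetical helper name, and it emits the same "N/**" lines that the seq and sed pipeline above writes to /tmp/xx.

```shell
# Sketch only: make_include_patterns is a hypothetical helper.
# Print one filter pattern per numbered directory from $1 to $2, e.g. "1/**",
# which matches everything beneath that directory.
make_include_patterns() {
  seq "$1" "$2" | sed 's#$#/**#'
}

# Example use: write the patterns, then preview the move with --dry-run first.
#   make_include_patterns 1 100 > /tmp/xx
#   rclone move -v --dry-run --include-from /tmp/xx --delete-empty-src-dirs . MYS3:archive/old-builds/
```

Running the move with --dry-run first shows which files the patterns select without transferring or deleting anything.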
- Mount a bucket as a folder in your file system. If you do not use --use-server-modtime, rclone will HEAD every object in the bucket, which is very slow.
$ mkdir /tmp/tickets
$ rclone mount --read-only --use-server-modtime support:tickets /tmp/tickets &
$ ls -lSh /tmp/tickets
WARNING: trying to use object storage like a file system usually makes neither client nor server happy. If you use this be sure the workload is consistent and test it well first.
- Use rclone purge to delete all object versions in a bucket and then delete the bucket. Note it uses individual DELETE requests instead of multi-delete requests (a POST bucket?delete with the list of objects in the body).
$ rclone purge --dry-run --transfers 20 datacore:old-pics