...
An S3 backup feed from Swarm provides the security of knowing that backups are continuous, have minimal latency, and require little intervention and monitoring. Using Swarm's feed mechanism for backup leverages numerous existing strengths:
The long-term iteration over objects in the cluster
...
A proven method for tracking work as it is performed
...
Support for TLS network encryption and forward proxies. Using the parallelism of the entire Swarm cluster makes the best use of network bandwidth
...
while sending the backups through an optional forward proxy allows implementing bandwidth throttling if needed.
Back up — S3 Backup is an integral part of the operating Swarm cluster. In the Swarm UI, create a new feed of type S3 Backup, and provide credentials and information about the network path to the service. After the feed is started, progress can be monitored and warnings of blockages and particular object failures can be sent, as with any other feed. The S3 Backup feed honors the versioning settings in a cluster, as enabled, disabled, or suspended throughout the domains and buckets. While multiple S3 Backup feeds can be created, each one requires a dedicated target bucket.
Clean up — No action is needed to keep the backup current and trimmed. When you disable Swarm versioning on buckets or domains, delete buckets or domains, or have object lifepoints expire, the Swarm feeds mechanism processes the expired content as deleted, allowing the S3 Backup feed to clear it from the S3 bucket. Throughout content additions and deletions, the total number of objects in the S3 bucket always approximates twice the number of logical objects being backed up from the source cluster (because AWS functionality requires one object for the content and another for its metadata).
Restore — The Restore tool runs outside of Swarm, using a command-line interface for executing the data and restoration tasks. Restore what is needed: either the entire cluster or only portions of it. Swarm supports bulk restores at the granularity of cluster, domain, or bucket, as well as more surgical restores of a few objects. Multiple copies of the tool can be run to achieve a faster, parallel recovery. See the S3 Backup Restore Tool.
...
Cold storage offers the lowest monthly prices per byte stored compared to the standard storage classes.
Standard storage classes have low-latency retrieval times, which can allow a Swarm Restore to complete in a single run.
Cold storage has longer retrieval latency, as much as 12-48 hours for S3 Glacier Deep Archive, to pull content from archival storage. Depending upon how a restore is performed, the Swarm Restore tool may need to be run multiple times over several hours to complete a restoration.
Cold storage incurs additional charges for egress and API requests to access the backup, so it is best suited to low-touch use cases.
S3 Glacier Deep Archive rounds up small objects, so the overall footprint being charged may be larger because of Swarm's use of metadata objects.
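As a rough illustration of the footprint effect, the sketch below estimates billed bytes when each logical Swarm object is stored as two S3 objects (content plus metadata) and the storage class rounds every object up to a minimum billable size. The function name, the metadata size, and the minimum billable size are illustrative assumptions, not published AWS or Swarm figures; substitute the values from the provider's current pricing page.

```python
def estimate_billed_bytes(content_sizes, metadata_size, min_billable):
    """Estimate billed bytes for a backup of logical objects.

    content_sizes: iterable of content sizes in bytes, one per logical object.
    metadata_size: assumed size in bytes of each companion metadata object.
    min_billable:  assumed minimum size in bytes billed per stored object.
    """
    total = 0
    for size in content_sizes:
        # Each logical object is backed up as two S3 objects:
        # one for the content and one for the metadata.
        total += max(size, min_billable)
        total += max(metadata_size, min_billable)
    return total


if __name__ == "__main__":
    sizes = [500, 10_000, 2_000_000]  # hypothetical object sizes in bytes
    raw = sum(sizes) + 1_000 * len(sizes)
    billed = estimate_billed_bytes(sizes, metadata_size=1_000, min_billable=8_192)
    print(f"S3 objects: {2 * len(sizes)}, raw bytes: {raw}, billed bytes: {billed}")
```

The gap between raw and billed bytes grows with the share of small objects, which is why clusters dominated by small objects benefit least from cold storage pricing.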
...
To implement an S3 backup feed, first complete a one-time setup of the S3 side. Set up an account with an S3 cloud service provider and then create an S3 bucket dedicated to backing up this cluster.
...
While these instructions are for AWS S3 (see also S3 Backup Feeds to Wasabi), other S3-based public cloud providers have a similar setup process:
Service — Sign up for Amazon S3 if needed.
Navigate to aws.amazon.com/s3 and select Get started with Amazon S3.
Follow the on-screen instructions.
AWS notifies by email when the account is active and ready to use.
Note: The new bucket is created through the S3 service, but the new user is created through the separate IAM service:
Bucket — Create a bucket dedicated to backing up the Swarm cluster.
Sign in and open the S3 console: console.aws.amazon.com/s3
Choose Create bucket. (See S3 documentation: Creating a Bucket.)
On tab 1 - Name and region, make the initial entries:
For Bucket name, enter a DNS-compliant name for the new bucket. This cannot be changed later, so choose well:
The name must be unique across all existing bucket names in Amazon S3.
The name must be a valid DNS name of 3 to 63 characters, containing only lowercase letters, numbers, and internal periods and hyphens. (See S3 documentation: Rules for Bucket Naming.)
Tip: For easier identification, incorporate the name of the Swarm cluster that this bucket is dedicated to backing up.
For Region, choose the one that is appropriate for business needs. (See S3 documentation: Regions and Endpoints.)
On tab 2 - Configure options, take the defaults. (See S3 documentation: Creating a Bucket, step 4.)
Best practice: Do not enable versioning or any other optional features, unless it is required for the organization.
On tab 3 - Set permissions, take the default to select Block all public access; now the bucket owner account has full access.
Best practice: Do not use the bucket owner account to provide Swarm's access to the bucket; instead, create a new, separate IAM user that holds the credentials to share with Swarm.
Choose Create, and record the fully qualified bucket name (such as "arn:aws:s3:::example.cluster1.backup") for use later, in policies.
Record these values for configuring the S3 Backup feed in Swarm:
Bucket Name
Region
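Because the bucket name cannot be changed later, it can help to check a candidate name before creating the bucket. The following is a minimal sketch of such a check, assuming the general-purpose bucket naming rules published by AWS at the time of writing (3 to 63 characters; lowercase letters, digits, periods, and hyphens; starting and ending with a letter or digit; no adjacent periods; not formatted like an IP address). The function name is hypothetical, and the check does not cover global uniqueness or reserved prefixes and suffixes; confirm against Rules for Bucket Naming.

```python
import re

# 3 to 63 characters: lowercase letters, digits, periods, hyphens,
# starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
# Names formatted like an IPv4 address (for example 192.168.5.4) are rejected.
_IPV4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name):
    """Return True if name passes the basic syntax checks described above."""
    if not _BUCKET_RE.fullmatch(name):
        return False
    if ".." in name:
        return False
    if _IPV4_RE.fullmatch(name):
        return False
    return True


if __name__ == "__main__":
    for candidate in ("example.cluster1.backup", "Example_Cluster1", "ab"):
        print(candidate, is_valid_bucket_name(candidate))
```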
User — Create a programmatic (non-human) user dedicated to Swarm access.
From the AWS console, select the IAM (Identity and Access Management) service, and click Users.
Add a dedicated user, such as caringo_backup, to provide Programmatic access for Swarm.
The IAM console generates an access key (an access key ID + secret access key), which must be recorded immediately. (See S3 documentation: Managing Access Keys for IAM Users and Understanding and Getting Your Security Credentials.)
This is the sole opportunity to view or download the secret access key, so save it in a secure place.
Record the fully qualified user (such as "arn:aws:iam::123456789012:user/caringo_backup") for use later, in policies.
Record these values for configuring the S3 Backup feed in Swarm:
Access Key ID
Secret Access Key
Policies — Create policies on both the user and the bucket so the programmatic user has exclusive rights to the S3 bucket. Use the policy generators provided or enter edited versions of the examples below.
Create an IAM policy for this user, allowing it all S3 actions on the backup bucket, which needs to be specified as a fully qualified Resource (recorded above), starting with arn:aws:s3:::. List both the bucket ARN and the bucket ARN followed by /* so that the policy covers bucket-level and object-level actions.
IAM policy
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::example.cluster1.backup",
        "arn:aws:s3:::example.cluster1.backup/*"
      ]
    }
  ]
}
Create a matching bucket policy to grant access to the dedicated backup user, which needs to be specified as a fully qualified Principal, which is the User ARN (recorded above) starting with arn:aws:iam::. (See S3 documentation: Using Bucket Policies.)
Using the Policy Generator, allow all S3 actions for the bucket and its objects, using the full ARN name:
Bucket policy
{
  "Id": "Policy1560809845679",
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Stmt1560809828003",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": [
        "arn:aws:s3:::example.cluster1.backup",
        "arn:aws:s3:::example.cluster1.backup/*"
      ],
      "Principal": {
        "AWS": [
          "arn:aws:iam::123456789012:user/caringo_backup"
        ]
      }
    }
  ]
}
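When several clusters each need their own backup bucket, both policy documents can be generated from the recorded bucket name and user ARN rather than edited by hand. The sketch below is one way to do that with the standard library only; the function name is hypothetical, and the optional Id and Sid fields are omitted. The Resource list names both the bucket ARN and its objects (`/*`), because AWS matches object-level actions such as PutObject against object ARNs.

```python
import json


def build_policies(bucket_name, user_arn):
    """Return (iam_policy, bucket_policy) dicts for a dedicated backup bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    # Bucket-level actions match the bucket ARN; object-level actions match "/*".
    resources = [bucket_arn, f"{bucket_arn}/*"]

    iam_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "s3:*", "Resource": resources}
        ],
    }
    bucket_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:*",
                "Resource": resources,
                "Principal": {"AWS": [user_arn]},
            }
        ],
    }
    return iam_policy, bucket_policy


if __name__ == "__main__":
    iam, bucket = build_policies(
        "example.cluster1.backup",
        "arn:aws:iam::123456789012:user/caringo_backup",
    )
    print(json.dumps(iam, indent=2))
    print(json.dumps(bucket, indent=2))
```

The printed JSON can be pasted into the IAM and S3 consoles in place of the hand-edited examples.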
Best practice for security: After implementing the S3 Backup feed in Swarm, write a script to automate the rotation of the S3 secret access key on a regular basis, including updating the S3 Backup feed definition in Swarm (using the management API call given in Rotating the S3 Access Key, below).
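The order of operations matters in such a script: the feed should hold the new credentials before the old key is retired. The following sketch shows only that ordering. The three callables are hypothetical placeholders supplied by the caller: `create_key` and `delete_key` would wrap the IAM access key operations, and `update_feed` would wrap the Swarm management API call, none of which are reproduced here.

```python
def rotate_access_key(create_key, update_feed, delete_key, old_key_id):
    """Rotate the backup user's access key without interrupting the feed.

    create_key:  hypothetical callable returning (new_key_id, new_secret).
    update_feed: hypothetical callable taking (key_id, secret) that updates
                 the S3 Backup feed definition in Swarm.
    delete_key:  hypothetical callable taking the ID of the key to retire.
    old_key_id:  ID of the access key currently used by the feed.
    """
    new_key_id, new_secret = create_key()
    try:
        update_feed(new_key_id, new_secret)
    except Exception:
        # The feed still uses the old key, so retire the unused new one.
        delete_key(new_key_id)
        raise
    # Retire the old key only after the feed holds the new credentials.
    delete_key(old_key_id)
    return new_key_id


if __name__ == "__main__":
    # Stand-in callables that only print, to show the call order.
    rotate_access_key(
        create_key=lambda: ("NEW_KEY_ID", "new-secret"),
        update_feed=lambda key_id, secret: print("feed updated to", key_id),
        delete_key=lambda key_id: print("deleted", key_id),
        old_key_id="OLD_KEY_ID",
    )
```

Cleaning up the new key on failure keeps the user from accumulating unused keys, which matters because IAM limits each user to two access keys.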
...