The S3 Backup Restore Tool is a standalone utility for performing DR from the S3 backup bucket, either to the original cluster or to an empty cluster that is meant to replace it. See S3 Backup Feeds.
Once the data is backed up in S3, the restore tool allows both examining a backup and controlling how, what, and where it is restored:
List all domains and buckets, or the buckets within a domain, with the logical space used for each.
List all objects within a bucket or unnamed objects in a domain, optionally with sizes and paging.
Restore either the complete cluster contents or a list of domains, buckets, or individual objects.
Rerun the restore, should any part of it fail to complete.
Partition the restoration tasks across multiple instances of the command-line tool, to run them in parallel.
...
The S3 Backup Restore tool has a separate install package included in the Swarm download bundle. Install it on one or more systems (more than one for parallel restores) where the restore processes will run.
Info |
---|
Required: The S3 Backup Restore Tool must be installed on a system running RHEL/CentOS 7. |
Preparation (one-time)
...
The swarmrestore package is delivered as a Python pip3 source distribution. Prepare each machine so that it can install this and future versions of swarmrestore.
As root, run the following command:
Code Block language bash yum install python3
Verify that version 3.6 is installed:
Code Block language bash python3 --version
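The same check can be scripted; this is a minimal sketch that fails with a clear message when the installed interpreter is older than the required 3.6:

```shell
# Exit nonzero with a message if python3 is older than 3.6.
python3 -c 'import sys; assert sys.version_info >= (3, 6), "Python 3.6+ required"' \
  && echo "Python version OK"
```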
Installation
If the Python 2 generation of the tool (caringo-swarmrestore-1.0.x.tar.gz
) is installed, first uninstall that version:
Code Block |
---|
pip uninstall caringo-swarmrestore |
Rerun this installation whenever a new version of swarmrestore is obtained:
Copy the latest version of the swarmrestore package to the server.
Run the following as root:
Code Block language bash pip3 install caringo-swarmrestore-<version>.tar.gz
At this point, swarmrestore is likely in
/usr/local/bin
and is likely already in the path. Repeat for any additional servers if planning to perform partitioning for parallel restores.
...
The tool uses a configuration file, .swarmrestore.cfg. Because the file contains sensitive passwords, the tool warns you if the configuration file is not access-protected (chmod
mode 600 or 400).
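To clear that warning, restrict the configuration file to its owner. A minimal sketch, assuming the file lives in the current user's home directory:

```shell
# Ensure the config exists, then limit access to the owner only (mode 600).
touch ~/.swarmrestore.cfg
chmod 600 ~/.swarmrestore.cfg
stat -c '%a' ~/.swarmrestore.cfg   # prints: 600
```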
...
Section | Settings | |
---|---|---|
[s3] |
| |
[s3] archival only | Set these additional parameters if using an S3 bucket with an archival storage class (Glacier, Glacier Deep Archive):
| |
[forwardProxy] | This section is for use only with an optional forward proxy:
| |
[log] | The same log settings as the Swarm cluster may be used; if you do so, identify the logs by looking for those with the component "
| |
[swarm] |
|
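Assembled, the file is an INI-style document with one section per row of the table above. The skeleton below shows only the section names given in this document; the individual settings are omitted here (and the home-directory location is an assumption), so consult the table for the keys each section accepts:

```ini
; ~/.swarmrestore.cfg — skeleton only; settings omitted
[s3]
; backup bucket location and credentials; archival-class parameters if needed

[forwardProxy]
; only when routing through an optional forward proxy

[log]
; may mirror the Swarm cluster's log settings

[swarm]
; target Swarm cluster settings
```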
...
Gateway — Add the IP of the machine where the Restore tool runs to the Gateway configuration setting
scsp.allowSwarmAdminIP
if communicating with a Swarm cluster via Gateway.
...
Info |
---|
Full cluster restore: Before undertaking a restore of a large cluster, contact DataCore Support. They will help you balance the speed of the restore with your bandwidth constraints by examining the space used by the S3 backup bucket, estimating the bandwidth needed, and recommending the best use of the tool. If using an AWS Glacier storage class, the AWS bucket may be pulled out of cold storage before the full cluster restore by changing the storage class to Standard. |
The restoration tool runs in batch style, with commands given on the command line. The tool logs its actions to the log file or server named in the log configuration section. The restoration tool uses the following command format:
...
Info |
---|
Specifying objects
|
...
Enumeration and selection are handled by the ls
command, which is modeled after the Linux command ls and whose results are captured with standard Linux stdout
. Use the command to visualize which domains and buckets have been backed up in S3 and are available to be restored. By default, the output is sorted by name and interactively paginated to help manage large result sets.
The ls
subcommand has this format:
...
-R or --recursive
— Recursively lists the given domain or bucket, or else the entire cluster. Without this option, the command lists only the top-level contents of the object.-v or --versions
— List previous versions of versioned objects. Versions are not listed by default.-l or --long
— Lists details for each item returned in the output:Creation date
Content length of the body
ETag
Archive status:
AN — Archived; not available for restoration
AR — Archived with an archive restore in progress; not available for restoration
AA — Archived with a copy available for restoration
OK — Not archived and fully available
Objectspec
Alias UUID, if the object is a domain or bucket
<objectspec>
— If none, the command runs across the entire contents of the S3 backup. If present, filters the command to a specific domain or bucket (context object) in Swarm. Use this format:
...
Info |
---|
Note: Use the double-slash format ( |
Running the command without any options returns the list of domains that are included in this S3 bucket for the Swarm cluster:
Code Block | ||
---|---|---|
| ||
>>> swarmrestore ls domain1/ domain2/ www.testdomain.com/ |
For a complete accounting of every object backed up for a specific domain, run a command like this, redirecting to an output file:
...
Info |
---|
Note: Use the double-slash format ( |
Any number of command options can be used, and the short forms may be combined with a single dash (-Rv
). The <objectspecs>
, -R
, and -v
options iterate over objects the same way as the ls
command.
...
-R or --recursive
— Recursively restore domains, buckets, or the entire cluster with an empty object spec. See above for what is iterated over when -R is not used.-v or --versions
— Include previous versions of versioned objects. They are not included by default.-f <file> or --file <file>
— Use objectspecs from a file instead of the command line.-p <count>/<total> or --partition <count>/<total>
— Partition work for a large restore job (every instance restores buckets and domains before objects).Example: To run 4 instances in parallel, configure each option to be one of the series:
-p 1/4, -p 2/4, -p 3/4, -p 4/4
-n or --noop
— Check what a restore would do, but do not restore any objects.Does not change the cluster state. The option can be used before and after a restore, as both a pre-check and a verification.
<objectspecs>
— Any number; newlines separate objects. If none, the top level of the cluster’s backup contents is the scope.Using no object specification with the command options
-Rv
causes Swarm to restore all backed up objects in the entire cluster, including any historical versions of versioned objects.
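As a usage sketch, a cautious sequence is pre-check, restore, then verify, using -n for the dry runs. The commands below are printed rather than executed; mydomain/ is a placeholder objectspec, and the restore subcommand name is an assumption based on this section:

```shell
# Print (not run) a pre-check, the actual restore, and a post-verification.
# Drop the 'echo' to execute against a configured cluster.
for opts in "-Rvn" "-Rv" "-Rvn"; do
  echo swarmrestore restore "$opts" mydomain/
done
```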
What gets restored: Restore copies an object from S3 to the cluster only if the cluster object is missing or older than the S3 object. Note that context objects always restore before the content they contain: restore first restores any domains or buckets needed before restoring the objects within them.
...
current
— The object was not restored because the target cluster already has the same version of the object.older
— The object was not restored because it is older than the one in the target cluster.obsolete
— The object was not restored because the cluster does not allow the object to be written. Usually, it means the object has been deleted.needed
— The object needs restoration, but the -n option was used.restored
— The object was successfully restored.nocontext
— The object cannot be restored because its parent domain or bucket cannot be restored.failure
— The object cannot be restored. Consult the logs for details.archived
— The object is archived and the restore tool is not configured for archive restoration. This is a failure condition.initiated
— The object is archived and the tool has issued an object restoration request. See the Amazon S3 API RestoreObject Request Syntax. This is also a failure condition, but the object is counted in the archive retrieval initiated stats. These are the operations by the restore tool that incur expense to the bucket owner.ongoing
— The object is in archive and a restoration request has already been initiated. Restoration from archive is in progress. This is also a failure condition.
Rate of Restore — Restoration may take a long time to run, especially if recursion (-R
) is used on domains or buckets. To boost the rate of restore, install the S3 Backup Restore tool on multiple servers and then run the restore command with partitioning parameters (-p
) across all the instances of the tool, which allows restoring faster in parallel, with minimal overlap.
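The partitioned run can be sketched as below. Each command should run on a separate server (or at least a separate process); the commands are printed rather than executed, and mydomain/ plus the restore subcommand name are assumptions:

```shell
# Print one restore command per partition of a 4-way parallel restore.
total=4
for i in $(seq 1 "$total"); do
  echo swarmrestore restore -R -p "$i/$total" mydomain/
done
```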
Headers for Audit — When the S3 Backup feed writes an object to the S3 bucket, it adds to the S3 copy a header (Castor-System-Tiered) that captures when and from where the object was tiered. When the S3 Backup Restore tool writes the S3 object back to Swarm, it includes that S3 header and then adds another of the same, to capture when and from where the object was restored. These paired headers (both named Castor-System-Tiered) provide the audit trail of the object's movement to and from S3. Swarm persists these headers but does not include them in Entity-MD5 or Header-MD5 calculations. The dates use the same format as Last-Modified (RFC 7232, section 2.2). See SCSP Headers.
...