Swarm can protect data on disk by creating multiple copies of each object, called replicas, on different nodes. You can control how many replicas are created for each object and how quickly they are created after the object is initially stored in the cluster.
...
Note
If only one replica of an object exists in a cluster, there is only one instance of that object in the cluster. In this context, replica, instance, and object are all synonymous.
How Replication Affects Risk
By default, each object in Swarm is stored with two replicas, with each replica residing on a different node in the cluster. Replicas are distributed across subclusters if a cluster is configured to use subclusters.
...
There can also be a period of vulnerability at the moment an object is first stored on Swarm if the Replicate On Write option, which creates multiple simultaneous replicas, is not used.
Controlling Replication Protection
A rapid sequence of drive failures is possible, but unlikely. If this presents an unacceptable risk for an application, the solution is to change the replication requirements. Changing the default replication requirements to a greater number of replicas trades disk space savings for added security.
Infonote |
---|
Caution: Specifying too many replicas relative to nodes has consequences. Setting the number of replicas equal to the number of storage nodes can lead to uneven loading when responding to volume recoveries. |
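The disk space side of this trade-off can be estimated with simple arithmetic. The following is a minimal sketch, not a Swarm tool: the function name `usable_capacity_tb` and the 120 TB figure are hypothetical, and the calculation assumes every object is stored as full replicas and ignores all overhead.

```python
def usable_capacity_tb(raw_tb: float, replicas: int) -> float:
    """Approximate usable capacity when every object is stored `replicas` times.

    Assumption: full-copy replication only, no metadata or filesystem overhead.
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return raw_tb / replicas


# Hypothetical cluster with 120 TB of raw disk space.
for replicas in (2, 3, 4):
    usable = usable_capacity_tb(120.0, replicas)
    print(f"{replicas} replicas: about {usable:.1f} TB usable of 120 TB raw")
```

Each additional replica lowers usable capacity, so the replica count should reflect how much risk the application can tolerate.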
...
Code Block | ||
---|---|---|
| ||
policy.replicas: min=2 max=5 default=3 |
Infowarning |
---|
Deprecated: The cluster setting policy.replicas replaces the following three, which are all deprecated: |
See Implementing Replication Policy.
Increasing Replication Priority
By default, Swarm writes a new object to one node, responds to the application with a success code and UUID (or name), and then quickly replicates the object as needed to other nodes or subclusters. The replication step is performed as a lower priority task.
While this creates the best balance of throughput and fault tolerance in most circumstances, there are cases where you may want to give the replication task the same priority as reads and writes, which ensures replication occurs quickly even under heavy sustained loads.
...
With replication set to priority 1, object replication is interleaved in parallel with other operations. This may have a negative impact on cluster throughput for use cases involving sustained, heavy writes. With health.replicationPriority = 1, it is still possible (though much less likely) that the failure of a node or volume causes some recently written objects to be lost if the failure occurs immediately after a write operation but before replication to another node completes.
Enforcing Replicate On Write (ROW)
Another replication strategy to protect content is Replicate on Write (ROW).
Without ROW, the client writes a single copy and depends on the Health Processor (HP) to create the necessary replicas. Relying on HP leaves open a small window for data loss: the node or volume holding the sole copy can fail before HP completes replication. ROW eliminates the window by guaranteeing all replicas are written on the initial request.
How it works: The ROW feature requires Swarm to create replicas in parallel before it returns a success response to the client. ROW protection applies to WRITE, UPDATE, COPY, and APPEND requests. When ROW is enabled, the secondary access node (SAN) sets up connections to the number of available peers required to create the needed replicas.
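The difference between ROW and background replication can be pictured with a toy model. The sketch below is an illustration only and does not reflect Swarm's internal implementation: the `Node` class and `write_with_row` function are hypothetical names, and nodes are simulated in memory. It shows the core idea that success is reported only after every required replica has been written in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Hypothetical in-memory stand-in for a storage node."""

    def __init__(self, name: str, healthy: bool = True):
        self.name = name
        self.healthy = healthy
        self.objects: dict[str, bytes] = {}

    def store(self, key: str, data: bytes) -> None:
        if not self.healthy:
            raise ConnectionError(f"{self.name} is unavailable")
        self.objects[key] = data


def write_with_row(key: str, data: bytes, peers: list[Node], replicas: int) -> bool:
    """Return True only if `replicas` copies were all written before responding."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    available = [node for node in peers if node.healthy]
    if len(available) < replicas:
        return False  # not enough peers: fail the request instead of writing one copy
    targets = available[:replicas]
    with ThreadPoolExecutor(max_workers=replicas) as pool:
        futures = [pool.submit(node.store, key, data) for node in targets]
        # exception() waits for each write to finish; None means it succeeded.
        outcomes = [future.exception() is None for future in futures]
    return all(outcomes)


nodes = [Node("node-a"), Node("node-b"), Node("node-c")]
ok = write_with_row("example-object", b"payload", nodes, replicas=2)
copies = sum("example-object" in node.objects for node in nodes)
print(ok, copies)
```

This model does not clean up copies left behind by a partially failed write; it only demonstrates why the client never sees a success response while a single unprotected copy exists.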
See Configuring ROW Replicate On Write.
...