Replication is a proven and valuable mechanism to ensure data integrity, but the cost per GB of storage can get high as object sizes and cluster sizes grow. A complementary data protection strategy, erasure coding (EC), provides high data durability with a smaller footprint. Swarm manages EC and replication together to optimize cost-effectiveness, converting objects between them seamlessly and dynamically, based on the policies that you set.

...

For very large objects, Swarm creates multiple sets of erasure segments. The breakdown of an object into one or more erasure sets is transparent to external applications. A GET or HEAD of an erasure-coded object uses the same syntax as for a replicated object.

...

If a hard drive or a node containing an erasure segment fails, Swarm can still read the object as long as at least k segments (any combination of original data and parity) remain in the cluster. In other words, the number of drive failures the object can survive is equal to the number of parity segments (p) specified.

For example, because the segments of a 5:2 erasure code (5 data segments and 2 parity segments, for 7 total) or an 8:2 erasure code (8 data segments and 2 parity segments, for 10 total) are distributed to different nodes, the object is protected against the loss of any two nodes. An erasure-coded object is immediately retrievable when accessed, even if some segments are missing. However, Swarm still regenerates the missing segments in a self-healing, cluster-initiated manner (similar to the recovery process for replicated objects) to protect against further drive loss. This process starts automatically when a missing volume is detected.
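The k-of-(k+p) rule described above can be illustrated with a minimal Python sketch. The function name `is_readable` is hypothetical and not part of any Swarm API, and the sketch assumes each segment sits on a different node, so that losing one node costs exactly one segment.

```python
def is_readable(k: int, p: int, lost_segments: int) -> bool:
    """Return True if at least k of the k + p segments survive.

    Assumption (illustration only): every segment is on a distinct node,
    so each lost node or drive removes exactly one segment.
    """
    if k <= 0 or p < 0 or lost_segments < 0:
        raise ValueError("k must be positive; p and lost_segments must be non-negative")
    surviving = max(k + p - lost_segments, 0)
    return surviving >= k


if __name__ == "__main__":
    # Walk through the two encodings mentioned in the text.
    for k, p in [(5, 2), (8, 2)]:
        for lost in range(p + 2):
            state = "readable" if is_readable(k, p, lost) else "unreadable"
            print(f"{k}:{p} with {lost} segment(s) lost: {state}")
```

Under these assumptions, both 5:2 and 8:2 objects stay readable with up to two segments lost and become unreadable at three, which matches the statement that protection equals the number of parity segments.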

...

(total segments ÷ data segments) × object size = object footprint
((k + p) ÷ k) × GB = total GB

How footprint changes with different EC encodings (versus 3 replicas)

  • 1 GB object with 5:2 encoding: ((5 + 2) ÷ 5) × 1 GB = 1.4 GB (vs. 3 GB for replication)
  • 3 GB object with 5:2 encoding: ((5 + 2) ÷ 5) × 3 GB = 4.2 GB (vs. 9 GB for replication)
  • 3 GB object with 7:3 encoding: ((7 + 3) ÷ 7) × 3 GB = 4.3 GB (vs. 9 GB for replication)
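The footprint formula and the comparisons above can be reproduced with a short Python sketch. The function names `ec_footprint_gb` and `replication_footprint_gb` are hypothetical helpers for illustration, not Swarm settings or APIs, and the calculation ignores any per-segment metadata overhead.

```python
def ec_footprint_gb(k: int, p: int, object_gb: float) -> float:
    """Footprint of an erasure-coded object: ((k + p) / k) * object size."""
    if k <= 0 or p < 0:
        raise ValueError("k must be positive and p non-negative")
    return (k + p) / k * object_gb


def replication_footprint_gb(replicas: int, object_gb: float) -> float:
    """Footprint of a replicated object: replicas * object size."""
    return replicas * object_gb


if __name__ == "__main__":
    # (k, p, object size in GB) for the three cases listed in the text.
    cases = [(5, 2, 1.0), (5, 2, 3.0), (7, 3, 3.0)]
    for k, p, size in cases:
        ec = ec_footprint_gb(k, p, size)
        rep = replication_footprint_gb(3, size)
        print(f"{size:g} GB object, {k}:{p} encoding: "
              f"{ec:.1f} GB (vs. {rep:g} GB for 3 replicas)")
```

Rounded to one decimal place, the results agree with the figures listed above.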

...