To work with very large objects or objects of unknown length, use the advanced options built into Swarm Elastic Content Protection:
- Erasure coding (EC), which segments and stores large objects efficiently and securely
- Multipart Write, which divides an object into multiple parts and uploads them simultaneously
These are the key terms used in Swarm Elastic Content Protection:
Term | Description |
---|---|
Chunked transfer encoding | Used in WRITE, UPDATE, and APPEND SCSP methods to send objects of undetermined content length to a storage cluster. The exact request header is: Transfer-Encoding: chunked. See RFC 7230 3.3.1. Note: COPY rewrites only the object manifest. |
Erasure coding | Describes one of the ways an object can be protected in a storage cluster. A large object written to the cluster using erasure coding is automatically stored on disk as a set of data and parity segments. This process ensures both content protection and optimal storage usage for large objects. Swarm has configuration parameters that enable an object to be automatically erasure-coded on the drive. |
Manifest | Swarm object containing a list of the segments that comprise a large object. |
Dividing Objects with Erasure Coding
Swarm lets you write large objects of known length using the erasure coding option in Swarm Elastic Content Protection. With this option, the object is divided into smaller data segments and encoded with additional parity segments that provide data protection.
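As a rough illustration of the storage trade-off, the arithmetic below assumes that an encoding written as k:p (such as the 5:2 example later in this section) means k data segments plus p parity segments, and that up to p segments of a set can be lost without losing data. The helper name `ec_footprint` is hypothetical, and the estimate ignores manifests and Swarm's actual segment sizing.

```python
from fractions import Fraction


def ec_footprint(object_bytes: int, data_segments: int, parity_segments: int) -> dict:
    """Estimate raw storage for a k:p erasure-coded object (hypothetical helper)."""
    if data_segments < 1 or parity_segments < 0:
        raise ValueError("need at least one data segment and no negative parity count")
    # Assumption: every data segment is accompanied by parity in the ratio p/k,
    # so the stored size grows by (k + p) / k compared with the original object.
    overhead = Fraction(data_segments + parity_segments, data_segments)
    return {
        "segments_per_set": data_segments + parity_segments,
        "overhead_factor": float(overhead),
        "approx_stored_bytes": int(object_bytes * overhead),
        "tolerated_segment_losses": parity_segments,
    }


# A 10 GiB object with a 5:2 encoding.
info = ec_footprint(10 * 1024**3, 5, 2)
print(info)
```

For comparison, keeping three full replicas of the same object would use three times its size, which is why erasure coding is attractive for large objects.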
Additionally, you can write (POST, PUT, COPY, APPEND) objects of unknown length to a cluster using standard HTTP chunked transfer encoding. Objects sent to the cluster using chunked transfer encoding are erasure-coded when stored on disk, using the encoding type specified by either cluster configuration or request query arguments. This feature allows you to store large objects and streaming media in the cluster.
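To make the wire format concrete, the sketch below frames a sequence of byte strings as an HTTP/1.1 chunked body: each chunk is its length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk ends the body. This shows only the generic HTTP framing, not anything specific to Swarm, and most HTTP client libraries do this for you. The function name `chunk_encode` is illustrative.

```python
from typing import Iterable, Iterator


def chunk_encode(parts: Iterable[bytes]) -> Iterator[bytes]:
    """Frame byte strings as an HTTP/1.1 chunked message body."""
    for part in parts:
        if not part:
            # A zero-length chunk marks the end of the body, so skip empty parts.
            continue
        # Chunk size in hex, CRLF, chunk data, CRLF.
        yield b"%X\r\n" % len(part) + part + b"\r\n"
    # Final zero-length chunk with no trailer fields.
    yield b"0\r\n\r\n"


wire = b"".join(chunk_encode([b"hello", b"", b"world!"]))
print(wire)
```

Because each chunk carries its own length, the sender never has to declare the total object size up front.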
Storing Large Objects
You can store an object as large as 4 TB in the cluster. Erasure coding is seamless and transparent to the application: Swarm automatically partitions the object into segments, encodes them, and distributes the segments throughout the cluster. When you configure the cluster, you set the size threshold at which objects become erasure-coded; in addition, applications can control erasure coding on a per-object basis. See Erasure Coding EC.
Attempting to store an object larger than 4 TB results in a 400 Bad Request response immediately after the write is submitted.
Increasing allowed object size
Storing Streaming Media
Streaming media is supported using industry-standard chunked transfer encoding. Your application can stream digital media or other types of data to the cluster without knowing the object size in advance. Object size is limited by the available space in the cluster, up to a maximum of 4 TB. Attempting to store a chunk-encoded object larger than 4 TB results in a 400 Bad Request response (see note above).
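A minimal sketch of streaming data of unknown length from a client, using Python's standard `http.client`. The host, port, and object path are placeholders, and the function names `read_pieces` and `stream_object` are hypothetical; authentication, retries, and redirect handling are omitted. Passing an iterable body with `encode_chunked=True` makes `http.client` send each piece as one chunk.

```python
import http.client
import io
from typing import BinaryIO, Iterator


def read_pieces(stream: BinaryIO, piece_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield successive pieces from a stream until it is exhausted."""
    while True:
        piece = stream.read(piece_size)
        if not piece:
            return
        yield piece


def stream_object(host: str, path: str, stream: BinaryIO, port: int = 80) -> int:
    """POST a stream with Transfer-Encoding: chunked and return the HTTP status."""
    conn = http.client.HTTPConnection(host, port, timeout=30)
    try:
        conn.request(
            "POST",
            path,  # placeholder object path, chosen by the caller
            body=read_pieces(stream),
            headers={"Transfer-Encoding": "chunked"},
            encode_chunked=True,  # let http.client frame each piece as a chunk
        )
        return conn.getresponse().status
    finally:
        conn.close()


# Local demonstration of the piece reader only; no network access is needed.
pieces = list(read_pieces(io.BytesIO(b"abcdefghij"), piece_size=4))
print(pieces)
```

A caller would check the returned status and treat 400 as a rejected write, for example when the object exceeds the size limit described above.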
Any object written with HTTP chunked transfer encoding must be erasure-coded and cannot be replicated. If you specify both erasure coding and replication in the same request (for example, combining an encoding=5:2 query argument with a lifepoint header that has a reps= parameter), the write operation results in a 400 Bad Request response.
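A client can catch this conflict before sending the request. The check below is a sketch under assumptions: it treats any header whose name starts with "lifepoint" and whose value contains "reps=" as a replication request, and the sample header value is illustrative rather than a complete lifepoint specification. The function name `has_conflicting_protection` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def has_conflicting_protection(url: str, headers: dict) -> bool:
    """Return True if a request asks for both erasure coding and replication."""
    query = parse_qs(urlsplit(url).query)
    wants_ec = "encoding" in query
    # Assumption: replication is requested through a reps= parameter
    # in a lifepoint header, as described in the text above.
    wants_reps = any(
        name.lower().startswith("lifepoint") and "reps=" in value.replace(" ", "").lower()
        for name, value in headers.items()
    )
    return wants_ec and wants_reps


conflict = has_conflicting_protection("/bucket/obj?encoding=5:2", {"Lifepoint": "[] reps=2"})
print(conflict)
```

Running the check first avoids a round trip that would end in 400 Bad Request.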