Working with Large Objects
To work with very large objects or objects of unknown length, use the advanced options built into Swarm Elastic Content Protection:
Erasure Coding (EC), which segments and stores large objects efficiently and securely
Multipart Write, which divides an object into multiple parts and uploads them simultaneously
These are key terms used in Swarm Elastic Content Protection:
| Term | Description |
|---|---|
| Chunked Transfer Encoding | Used in WRITE, UPDATE, and APPEND SCSP methods to send objects of undetermined content length to a storage cluster. The exact request header is: Transfer-Encoding: chunked. See RFC 7230, section 3.3.1. Note: COPY rewrites the object manifest only. |
| Erasure Coding | Describes one of the ways an object can be protected in a storage cluster. A large object written to the cluster using erasure coding is automatically stored on disk as a set of data and parity segments. This process provides both content protection and optimal storage usage for large objects. Swarm has configuration parameters that enable an object to be automatically erasure-coded on the drive. |
| Manifest | Swarm object containing a list of the segments that comprise a large object. |
Dividing Objects with Erasure Coding
Swarm lets you write large objects of known length using the erasure coding option built into Swarm Elastic Content Protection. With this option, the object is divided into smaller data segments and encoded with additional parity segments that provide data protection.
Additionally, you can write (POST, PUT, COPY, APPEND) objects of unknown length to a cluster using standard HTTP chunked transfer encoding. Objects sent to the cluster using chunked transfer encoding are erasure-coded when stored on disk, using the encoding type specified by either cluster configuration or request query arguments. This feature allows you to store large objects and streaming media in the cluster.
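As a rough illustration of the chunked write described above, the following sketch uses Python's standard `http.client` to POST a stream of unknown length with `Transfer-Encoding: chunked`. The helper names (`read_in_blocks`, `build_path`, `stream_upload`), the host, port, and object path are hypothetical and chosen for illustration only. The `encoding` query argument is the one mentioned in this section, and its value here is only an example.

```python
import http.client
from typing import BinaryIO, Iterator, Optional
from urllib.parse import urlencode


def read_in_blocks(stream: BinaryIO, block_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield blocks from a file-like object until it is exhausted.

    The total length never needs to be known in advance.
    """
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield block


def build_path(path: str, encoding: Optional[str] = None) -> str:
    """Append an optional encoding query argument (for example "5:2")."""
    if encoding is None:
        return path
    return f"{path}?{urlencode({'encoding': encoding}, safe=':')}"


def stream_upload(host: str, port: int, path: str, stream: BinaryIO,
                  encoding: Optional[str] = None) -> int:
    """POST a stream using chunked transfer encoding; return the HTTP status.

    Assumption: the target accepts a plain HTTP POST at `path`.
    """
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request(
            "POST",
            build_path(path, encoding),
            body=read_in_blocks(stream),
            headers={"Transfer-Encoding": "chunked"},
            encode_chunked=True,  # let http.client frame each block as a chunk
        )
        response = conn.getresponse()
        response.read()  # drain the body so the connection closes cleanly
        return response.status
    finally:
        conn.close()
```

A call might look like `stream_upload("swarm.example.com", 80, "/mybucket/capture.ts", open("capture.ts", "rb"), encoding="5:2")`, where every argument value is a placeholder.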
Storing Large Objects
You can store an object as large as 4TB in the cluster. Erasure coding is seamless and transparent to the application: Swarm automatically partitions the object into segments, encodes them, and distributes the segments throughout the cluster. When you configure the cluster, you set the size threshold at which objects become erasure-coded; in addition, applications can control erasure coding on a per-object basis. See Erasure Coding EC.
Attempting to store an object larger than 4TB results in an HTTP 400 Bad Request response immediately after the write is submitted.
Increasing Allowed Object Size
To store objects larger than 4TB, increase the limit set by ec.maxSupported (defaults to 4398046511104) and also set ec.segmentSize (defaults to 200000000) to a proportionately larger value. On a full read, Swarm must load the entire manifest; increasing the segment size reduces the size of the manifest and therefore the number of socket connections required to read an entire EC object. (SWAR-7823)
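The proportional scaling described above can be worked through with a little arithmetic. The sketch below uses the two defaults quoted in the text and treats the number of data segments as a rough proxy for manifest size. That proxy is an assumption: parity segments and the actual manifest layout are not modeled, and the function names are hypothetical.

```python
EC_MAX_SUPPORTED_DEFAULT = 4398046511104  # default ec.maxSupported, in bytes
EC_SEGMENT_SIZE_DEFAULT = 200000000       # default ec.segmentSize, in bytes


def data_segment_count(object_size: int, segment_size: int) -> int:
    """Number of data segments needed to hold object_size bytes (ceiling division)."""
    return -(-object_size // segment_size)


def proportional_segment_size(new_max_supported: int) -> int:
    """Scale the default segment size by the same factor as the size limit."""
    scaled = EC_SEGMENT_SIZE_DEFAULT * new_max_supported
    return -(-scaled // EC_MAX_SUPPORTED_DEFAULT)


# With the defaults, a maximum-size object needs 21991 data segments.
print(data_segment_count(EC_MAX_SUPPORTED_DEFAULT, EC_SEGMENT_SIZE_DEFAULT))

# Doubling the limit and doubling the segment size keeps the count the same.
doubled_max = 2 * EC_MAX_SUPPORTED_DEFAULT
doubled_segment = proportional_segment_size(doubled_max)  # 400000000
print(data_segment_count(doubled_max, doubled_segment))   # 21991
```

If the limit were doubled without changing the segment size, the same model gives roughly twice as many segments for a maximum-size object, which is the manifest growth the proportional increase is meant to avoid.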
Storing Streaming Media
Streaming media is supported using industry-standard chunked transfer encoding. Your application can stream digital media or other types of data to the cluster without knowing the object size in advance. The size of the object is limited only by the available space in the cluster, up to the 4TB maximum. Attempting to store a chunk-encoded object larger than 4TB results in an HTTP 400 Bad Request response (see Storing Large Objects above).
Any object written with HTTP chunked transfer encoding must be erasure-coded and cannot be replicated. If a write request specifies both erasure coding and replication (for example, combining an encoding=5:2 query argument with a lifepoint header that has a reps= parameter), the write operation results in an HTTP 400 Bad Request response.
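A client can catch this conflict before sending the request. The following is a minimal sketch of such a pre-flight check, not part of any Swarm client library. The function name is hypothetical, and the lifepoint header value used in the usage example is illustrative only.

```python
import re
from typing import Iterable, Tuple
from urllib.parse import parse_qs, urlsplit


def mixes_ec_and_replication(url: str, headers: Iterable[Tuple[str, str]]) -> bool:
    """Return True if the request asks for both erasure coding and replication.

    Erasure coding is detected by an `encoding` query argument; replication
    by a `reps=` parameter in any lifepoint header.
    """
    query = parse_qs(urlsplit(url).query)
    has_encoding = "encoding" in query
    has_reps = any(
        name.lower() == "lifepoint" and re.search(r"\breps\s*=", value, re.IGNORECASE)
        for name, value in headers
    )
    return has_encoding and has_reps


# Usage: this combination would be rejected with HTTP 400, so flag it early.
conflict = mixes_ec_and_replication(
    "/mybucket/capture.ts?encoding=5:2",
    [("Lifepoint", "[] reps=2")],  # illustrative header value
)
print(conflict)  # True
```

Headers are passed as name/value pairs rather than a dictionary so that a request carrying several lifepoint headers can be checked in one call.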
© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.