Swarm has three types of storage policies: replication, erasure coding, and versioning. Each can be customized at the level of domains and buckets, but this section covers the Swarm settings that control the cluster-wide requirements. These policies appear in the Policy section of the Cluster Settings in the Swarm UI.
Settings that show an SNMP name are persisted settings and can be updated dynamically without a cluster restart. See Swarm Storage Policies.
Replication Policy
See Implementing Replication Policy for how to create custom replication policies on specific domains and buckets.
Setting | Default | Description |
---|---|---|
policy.replicas SNMP: policyReplicas | min:2 max:16 default:2 anchored | The minimum, maximum, and default number of replicas allowed for objects in this cluster. Can differ from the policy in a replicated target cluster. Examples:
|
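The default value of policy.replicas shown above has a simple token structure. The following is a minimal sketch of how a script might parse and sanity-check such a string. The names `ReplicasPolicy` and `parse_replicas_policy` are hypothetical, and the rules that all three numeric fields are present and that min <= default <= max are assumptions of this sketch, not documented Swarm behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicasPolicy:
    """Hypothetical container for a parsed policy.replicas value."""
    minimum: int
    maximum: int
    default: int
    anchored: bool


def parse_replicas_policy(value: str) -> ReplicasPolicy:
    """Parse a string shaped like 'min:2 max:16 default:2 anchored'."""
    fields = {}
    anchored = False
    for token in value.split():
        if token == "anchored":
            anchored = True
            continue
        key, sep, number = token.partition(":")
        if not sep or key not in ("min", "max", "default"):
            raise ValueError(f"unexpected token: {token!r}")
        fields[key] = int(number)
    missing = {"min", "max", "default"} - fields.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    policy = ReplicasPolicy(fields["min"], fields["max"], fields["default"], anchored)
    # Assumed sanity check for this sketch; not taken from the Swarm documentation.
    if not policy.minimum <= policy.default <= policy.maximum:
        raise ValueError("expected min <= default <= max")
    return policy


if __name__ == "__main__":
    print(parse_replicas_policy("min:2 max:16 default:2 anchored"))
```

A check like this can catch a malformed value in a configuration review before it is applied to the cluster.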
Erasure Coding Policy
See Implementing EC Encoding Policy for how to create custom EC encoding policies on specific domains and buckets.
Setting | Default | Description |
---|---|---|
ec.conversionPercentage SNMP: ecConversionPercentage | 0 | Percentage, 1-100; 0 stops all conversion. Adjusts the rate at which the Health Processor consolidates multi-set erasure-coded objects each HP cycle. Lower to reduce cluster load; increase to convert a large number of eligible objects faster, at the cost of load on the cluster. If enabled, requires policy.eCEncoding to be specified. |
ec.maxManifests | 6 | Range, 3-36. The maximum number of manifests written for an EC object. Usually, p+1 are written for a k:p encoding. Requirement: Manifests must all be written to different nodes, even when using ec.protectionLevel=volume. Do not set above 6 unless directed by Support. |
ec.minParity | -1 | Range -1 or 1-4; default of -1 is max(policyminreps - 1, 1), where policyminreps is the min value in policy.replicas. The minimum number of parity segments the cluster requires. This is the lower limit on p for EC content protection, regardless of the parity value expressed on individual objects through query arguments or lifepoints. |
ec.protectionLevel SNMP: ecProtectionLevel | node | One of 'node', 'subcluster', or 'volume'. The level at which segments must be distributed for an EC write to succeed. Note: multiple segments are allowed per level, if needed. 'node' (default) distributes segments across the cluster's physical/virtual machines. 'subcluster' requires node.subcluster to be defined across sets of nodes. You must have (k+p)/p nodes/subclusters for those levels; at minimum, you must have k+p volumes. See details below. |
ec.segmentConsolidationFrequency SNMP: ecSegmentConsolidationFrequency | 10 | Percentage, 1-100, 0 to disable. How quickly the health processor consolidates object segments after ingest. Increase this value (such as to 25, to consolidate over 4 HP cycles) to make new content readable sooner by clients. For multipart uploads via S3 clients, 10 is recommended; for SwarmNFS, 100 is recommended, with extra space allowances for trapped space. Consolidation changes the ETag (which affects If-Match requests) and Castor-System-Version headers, but Content-MD5 and Composite-Content-MD5 headers are unchanged. Therefore, have clients use the hash and last-modified date, rather than ETag, to determine whether an object has changed. |
ec.segmentSize | -1 | In bytes; default of -1 implies 200 MB, with recommended minimum of 100 MB. The maximum size allowed for an EC segment before triggering another level of erasure coding. For mostly large (1+ GB) objects, increase to minimize the number of EC sets, which reduces index memory usage. Alternatively, increase the size as needed per write request using the 'segmentsize' query argument. |
policy.eCEncoding SNMP: policyECEncoding | unspecified anchored | The cluster-wide setting for the EC (erasure coding) encoding policy. Valid values: unspecified, disabled, k:p (a tuple such as 5:2 that specifies the data (k) and parity (p) encoding to use). Add 'anchored' to set this cluster-wide; remove it to allow domains and buckets to have custom encodings. |
policy.ecMinStreamSize SNMP: policyECMinStreamSize | 1MB anchored | In integer units of megabytes (MB) or gigabytes (GB); must be 1MB or greater. The size that triggers an object to be erasure-coded, if specified (by eCEncoding, lifepoint, query argument) and allowed by policy. Below this threshold, objects are replicated unless they are multipart or chunked writes. Add 'anchored' to set this cluster-wide; remove it to allow domains and buckets to have custom values. |
ec.convertToPolicy | False | Added in Swarm v14.1. Converts EC objects to the current policy encoding at a rate based on ec.conversionPercentage. In Swarm v15.0, EC objects smaller than the minimum EC encoding size (based on policy) are converted to whole replicas. The setting is disabled by default and can be enabled from the UI. If enabled, the related setting ec.conversionPercentage must also be set to a non-zero value for conversions to occur. Note: In Swarm 15.0, the conversion applies to both historical and current versions. After the conversion, it takes health.segLifepointUpdateInterval (default is one day) until HP deletes the original EC segments. |
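The ec.minParity default and its relation to a k:p encoding can be worked through in a few lines. This is an illustrative sketch only: the function names are hypothetical, and it reports whether an encoding satisfies the lower limit on p without claiming what Swarm does when it does not.

```python
def effective_min_parity(ec_min_parity: int, policy_min_reps: int) -> int:
    """Resolve ec.minParity, where -1 means max(policyminreps - 1, 1)."""
    if ec_min_parity == -1:
        return max(policy_min_reps - 1, 1)
    if 1 <= ec_min_parity <= 4:
        return ec_min_parity
    raise ValueError("ec.minParity must be -1 or in the range 1-4")


def parse_encoding(encoding: str) -> tuple:
    """Split a k:p encoding such as '5:2' into (k, p)."""
    k_text, sep, p_text = encoding.partition(":")
    if not sep:
        raise ValueError(f"not a k:p encoding: {encoding!r}")
    k, p = int(k_text), int(p_text)
    if k < 1 or p < 1:
        raise ValueError("k and p must be positive")
    return k, p


def encoding_meets_min_parity(encoding: str, min_parity: int) -> bool:
    """True when the parity count p is at or above the required minimum."""
    _, p = parse_encoding(encoding)
    return p >= min_parity


if __name__ == "__main__":
    # With the default policy.replicas min of 2, -1 resolves to max(2 - 1, 1).
    min_parity = effective_min_parity(-1, policy_min_reps=2)
    print(min_parity, encoding_meets_min_parity("5:2", min_parity))
```

With the default policy.replicas minimum of 2, the default ec.minParity of -1 resolves to 1, so any encoding with at least one parity segment satisfies it.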
What EC Protection Level is needed?
The EC protection level determines how strictly EC segments must be distributed for a write to succeed, or else return an error (412 Precondition Failed) to the writing application. After Swarm writes an object to the cluster, the health processor attempts to maintain the requested protection level. If cluster resources become unavailable, it degrades gracefully. When this occurs, the health processor logs errors, alerting you that the requested protection cannot be maintained and data may be at risk.
Regardless of the protection level set, Swarm always makes a best effort to distribute segments as broadly as possible across hardware, to protect data.
ec.protectionLevel | Cluster requirements | Effect |
---|---|---|
subcluster | >= (k+p)/p subclusters | Requires a subcluster for every p segments. Use only if geographical or systems-based subclusters are defined to factor into content protection. |
node (default) | >= (k+p)/p nodes | Requires a node for every p segments. Use for most situations. Important: When working with a small number of nodes, verify that the cluster can support the EC encoding. |
volume | >= k+p volumes | Least protection. Requires k+p volumes, but p+1 nodes are still needed because the manifest must be written to separate nodes. Use only if you have insufficient nodes for node-based protection. |
any | < k+p volumes | Unsupportable. EC writes fail. |
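The cluster requirements in the table above can be expressed as a small check, which may help when sizing a small cluster. This is a sketch under stated assumptions: `meets_requirements` is a hypothetical helper, and it encodes only the inequalities listed in the table, not Swarm's actual placement logic. The condition "count >= (k+p)/p" is written as "count * p >= k+p" to stay in integer arithmetic.

```python
def meets_requirements(level: str, k: int, p: int, *,
                       subclusters: int = 0, nodes: int = 0,
                       volumes: int = 0) -> bool:
    """Check the table's resource requirements for a k:p encoding."""
    segments = k + p
    # Fewer than k+p volumes is unsupportable at any protection level.
    if volumes < segments:
        return False
    if level == "subcluster":
        # One subcluster for every p segments.
        return subclusters * p >= segments
    if level == "node":
        # One node for every p segments.
        return nodes * p >= segments
    if level == "volume":
        # Manifests still need p+1 separate nodes.
        return nodes >= p + 1
    raise ValueError(f"unknown protection level: {level!r}")


if __name__ == "__main__":
    # 5:2 encoding: 7 segments, so node-level protection needs 7/2 = 3.5,
    # which means at least 4 nodes.
    print(meets_requirements("node", 5, 2, nodes=4, volumes=8))
    print(meets_requirements("node", 5, 2, nodes=3, volumes=9))
    print(meets_requirements("volume", 5, 2, nodes=3, volumes=9))
```

For a 5:2 encoding, the sketch shows why three nodes are not enough for node-level protection even though the same three nodes satisfy the volume-level row when they hold at least seven volumes.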
Note
Using ec.subclusterLossTolerance = volume is not recommended as it may not provide data availability when one or more nodes are down for maintenance.
Deprecated
The setting ec.subclusterLossTolerance has been deprecated; remove it from configurations when upgrading to Swarm 10.
Versioning Policy
Swarm has policy support for object versioning. Versioning can be enabled for specific contexts (domains and buckets) after the cluster is configured to permit versioning of objects.
See Implementing Versioning for how to create versioning policies on specific domains and buckets.
Setting | Default | Description |
---|---|---|
policy.versioning SNMP: policyVersioning | disallowed | Specifies whether versioning is allowed to be enabled on contexts (domains and buckets) within the cluster. Valid states: disallowed, suspended, allowed. This policy overrides context-level policies. Disallowed removes historical versions, if any. Suspended stops creation of new versions but retains version history. Examples:
|