Swarm Storage 14.0.1 Release

New Features

Erasure Coding Improvements - Swarm 14 includes Erasure Coding related improvements:

  • With Swarm 14.0, indexed erasure-coded (EC) objects include the field "ec_encoding", which records the current EC encoding of the object. Non-EC objects do not have this field. (SWAR-6653)

  • Swarm now computes the data footprint of EC segments and of whole-replica objects separately during each HP cycle, so the relative space usage of whole replicas versus EC can inform space usage policy decisions. (SWAR-9160)
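
As a rough illustration of how the new field might be used, the sketch below builds an Elasticsearch "exists" query that matches only indexed objects carrying ec_encoding. The ELASTICSEARCH host and the SEARCH_INDEX index name are placeholders and assumptions, not product defaults; substitute the values for the search feed in use.

```shell
# Print the JSON body of an Elasticsearch "exists" query for ec_encoding.
# "size":0 asks only for the hit count, not the documents themselves.
ec_query_body() {
  printf '%s' '{"size":0,"query":{"exists":{"field":"ec_encoding"}}}'
}

# Hypothetical usage (host and index name are placeholders):
#   curl -s -H 'Content-Type: application/json' \
#     -XPOST "http://ELASTICSEARCH:9200/SEARCH_INDEX/_search" \
#     -d "$(ec_query_body)"
```

Because non-EC objects do not have the field, the hit count returned by such a query would reflect EC objects only.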

Additional Changes

These items are other changes, including those that come from testing and user feedback.

OSS Versions

See Third-Party Components for Storage 14.0.1 for the complete listing of packages and versions for this release.

Fixed in 14.0

  • Reboot Loop Due to a Bad Disk: The Swarm node recognizes the volume as failed and alerts the cluster to the failure when a volume fails at mount time. The node operates with the remaining volumes, so physically removing the volume may be necessary. (SWAR-9189)

  • Remove Legacy Nonce Handling: Remove scsp.forceLegacyNonce settings from the node.cfg files prior to upgrading to 14.0. (SWAR-9108)

  • Bucket Listings: Fixed an issue in which, during a node reboot (such as a rolling reboot of the cluster), a newly booted node temporarily returned an empty result set for a listing query. (SWAR-9083)

  • S3 Backup Feed: The 5 GB object size limitation is removed. (SWAR-8554)
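
For the legacy nonce item above, one possible way to prepare a cleaned configuration is sketched below. It assumes node.cfg holds one "name = value" setting per line; the function name and the output file name are hypothetical, and the original file is left untouched so the result can be reviewed first.

```shell
# Print a node.cfg-style file without any scsp.forceLegacyNonce lines.
# Assumes one "name = value" setting per line; the input file is not modified.
strip_legacy_nonce() {
  grep -v -E '^[[:space:]]*scsp\.forceLegacyNonce[[:space:]]*=' "$1"
}

# Hypothetical usage: write a reviewed copy next to the original.
#   strip_legacy_nonce node.cfg > node.cfg.new
```

Comparing the copy against the original (for example with diff) before replacing the file confirms that only the intended line was dropped.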

Upgrade Impacts

Required

Complete the migration to Swarm 11.3 and ES 6.8.6 before upgrading to Swarm 14 if running older Elasticsearch (5.6.12 or 2.3.3). See Upgrading from Unsupported Elasticsearch.

These items are changes to the product function that may require operational or development changes for integrated applications. Address the upgrade impacts for each of the versions since the currently running version:

Impacts for 14.0

  • Change in the node.cfg file: The previously deprecated sysctl section of the node.cfg file is removed. Use kernel.sysctlFileUrl (introduced in Swarm 12.0) instead if it is necessary to set kernel runtime parameters. (SWAR-8968)

    Settings changes

    • Updated:

      • All sysctl.* settings are removed. (SWAR-8968)

      • support.reportPeriod default is changed to 21600 (6 hours). (SWAR-8424)

      • The following settings are now persisted cluster settings that can be updated via SNMP and the UI (SWAR-9115):

        • cluster.enforceTenancy

        • cluster.proxyIPList

        • ec.maxManifests

        • ec.minParity

        • ec.segmentSize

        • feeds.retry

        • health.parallelWriteTimeout

        • health.underreplicationAlertPercent

        • health.underreplicationTolerance

        • health.persistentUnderreplicationAlertPercent

        • log.obscureUUIDs

        • scsp.clientPoolTimeout

        • scsp.defaultContextReplicas

        • scsp.defaultROWAction

        • scsp.maxWriteTime

        • scsp.validateOnRead

        • search.numberOfShards

  • Swarm storage node metrics are deprecated and are replaced in the next major release by the graphs and reporting from Prometheus Node Exporter and Grafana. The storage administration UI is updated to allow metrics to be turned off. Clear metrics.target from the configuration, uninstall caringo-elasticsearch-metrics, and issue the following cURL command to clear the space in the Elasticsearch cluster. (SWAR-8982)

    curl -XDELETE 'http://ELASTICSEARCH:9200/metrics-*'

  • Differences in scsp.forceLegacyNonce configuration, depending on the version being upgraded from (SWAR-9020):

    • Currently running a Swarm Storage version prior to 11.1 and upgrading to 11.1, 11.2, 11.3, 12.0, or 12.1:

      Before upgrading, set scsp.forceLegacyNonce=true in the node.cfg file. After the upgrade, when the cluster is fully up, set scsp.forceLegacyNonce=false using swarmctl and also change the value to scsp.forceLegacyNonce=false in the node.cfg file.

    • Currently running Swarm Storage version 11.1, 11.2, 11.3, 12.0, or 12.1 and upgrading to another version from that list:

      Before upgrading, verify that scsp.forceLegacyNonce=false is in the node.cfg file, and verify using swarmctl that scsp.forceLegacyNonce=false in the cluster.

  • Subcluster assignments can no longer be blank. In CSN installations with mixed subcluster assignments, the unassigned nodes are unable to boot and show an error in contacting the time source. Supply a subcluster for each node if any named subcluster is specified in the cluster. (SWAR-7675)
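
Before upgrading, it may help to scan the existing configuration for the settings these impacts call out. The sketch below is one way to do that, assuming node.cfg holds one "name = value" setting per line; the function name is hypothetical. It reports any sysctl.* entries (removed in 14.0) and any metrics.target entry (to be cleared when turning metrics off).

```shell
# List settings in a node.cfg-style file that the 14.0 impacts call out:
# removed sysctl.* entries and the deprecated metrics.target.
# Assumes one "name = value" setting per line; prints "line:content" matches.
find_impacted_settings() {
  grep -n -E '^[[:space:]]*(sysctl\.[^=[:space:]]+|metrics\.target)[[:space:]]*=' "$1"
}

# Hypothetical usage:
#   find_impacted_settings node.cfg
```

No output (and a non-zero exit status from grep) means that none of these settings were found in the file.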

Use swarmctl to Check or Change Settings

Use 'swarmctl -C scsp.forceLegacyNonce' to check the value of scsp.forceLegacyNonce.

Use 'swarmctl -C scsp.forceLegacyNonce -V False' to set the value to false.

For more details, see https://support.cloud.caringo.com/tools/Tech-Support-Scripts-Bundle-swarmctl.pdf.
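
To review several settings in one pass, the check command above can be generated per setting name. This is only a sketch: it prints the commands instead of running them, uses only the -C form shown above, and assumes that swarmctl accepts other setting names the same way. Any connection options the environment requires are not shown and would need to be added.

```shell
# Print (do not run) a swarmctl check command for each setting name given.
# Uses only the -C form shown above; add any connection options the
# environment requires before running the printed commands.
print_check_commands() {
  for name in "$@"; do
    printf 'swarmctl -C %s\n' "$name"
  done
}

print_check_commands scsp.forceLegacyNonce ec.minParity ec.segmentSize
```

Printing first keeps the step a dry run, so the list can be reviewed before anything is sent to the cluster.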

Cumulative Impacts

Address all upgrade impacts for each version released since the version being upgraded from.

Review the comprehensive Upgrade Impacts listed for the Swarm Storage 14.0.1 Release.

Watch Items and Known Issues

The following watch items are known:

  • A node fails to mount all of its disks if it mounts an encrypted volume that is missing the encryption key in the configuration. (SWAR-8762)

  • S3 Backup feeds do not back up logical objects greater than 5 GB; those writes fail with a CRITICAL log message. (SWAR-8554)

  • The chassis shuts down but does not come back up when restarting a cluster of virtual machines that are UEFI-booted (versus legacy BIOS). (SWAR-8054)

These are standing operational limitations:

  • The Storage UI shows no NFS config if the Elasticsearch cluster is wiped. Contact DataCore Support for help repopulating the SwarmFS config information. (SWAR-8007)

  • Any incomplete multipart upload into a bucket leaves the parts (unnamed streams) in the domain if the bucket is deleted. To find and delete them, use the s3cmd utility (search the Support site for "s3cmd" for guidance). (SWAR-7690)

  • Removing subcluster assignments in the CSN UI creates invalid config parameters that prevent the unassigned nodes from booting. (SWAR-7675)

  • False 404 Not Found and other SCSP errors may display during rolling reboot in versions 11.1 through 12.0.1. Set scsp.forceLegacyNonce=False in the cluster configuration to mitigate this problem. This setting needs to be removed before upgrading to 12.1.0 or later. (SWAR-9020)

  • S3 Backup restoration to the cluster may be blocked if the certificate is not located where Swarm expects it when using certificates with HAProxy. From 12.1, a clearer error message draws attention to the issue for S3 backup and replication feeds that are blocked due to invalid X.509 ("SSL") certificates. (SWAR-8996)

To upgrade from Swarm 9 or higher, proceed now to How to Upgrade Swarm. For migration from Swarm 8.x or earlier, contact DataCore Support for guidance.

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.