Swarm Storage 12.1 Release

New Features

  • Performance Gains

    • The maximum CPU core count is increased from 64 to 512. (SWAR-9061)

    • Population of the overlay index at startup is improved, which allows it to reach an authoritative state more quickly. (SWAR-9042)

    • The number of simultaneous requests to port 90 (legacy console) and port 91 (management API) is now metered. This limits the operational impact of something like an errant monitoring system. (SWAR-8985)

  • Health Processing and Monitoring

    • Swarm now gathers and publishes incremental statistics for TCP and UDP inter-cluster communications. These statistics can be used to help pinpoint cluster-specific network issues. (SWAR-9047)

    • Swarm now performs defragmentation (trapped space reduction) more evenly throughout the health processor cycle to reduce trapped space fluctuations. (SWAR-8892)

    • A cluster can now be configured for faster volume defragmentation in cases where a backlog of disk fragmentation arises from small object turnover. Contact DataCore Support to have this configured if the cluster would benefit from such a change. (SWAR-6472)

    • Swarm log messages are now tagged with unique reporting codes that facilitate troubleshooting when working with DataCore Support. (SWAR-7761)

  • Settings Updates

    • The setting scsp.enableVolumeRedirects is now a persisted cluster setting. (SWAR-9003)

    • The setting recovery.autoSuspendMissingHintedVolumes is added to allow Support to automatically suppress false FVRs for unknown volumes that may be impacting client performance. (SWAR-9067)

  • Network Interface Details in Diagnostics Menu: The Diagnostics Menu in the system menu has additional functionality for viewing the mapping of NIC names to real MAC addresses. The new option is under #6: Network Interface Details. (SWAR-9033)

  • Preserve Settings During an Upgrade: The configure_elasticsearch_with_swarmsearch script now preserves several settings such as path.data and network.host from before the upgrade. (SWAR-9034)

  • Trigger Maintenance Mode for System Console Shutdown/Reboot: Restarting a Swarm node from the system menu (hardware console) now triggers maintenance mode and unifies the reboot behavior across the system menu, UI, and SNMP. (SWAR-8393)

Additional Changes

These items are other changes, including those that come from testing and user feedback.

OSS Versions

See Third-Party Components for Storage 12.1 for the complete listing of packages and versions for this release.

  • Linux kernel is updated to 5.4.109 (SWAR-8771)

  • Kernel firmware drivers are updated to 2021-03-03 (SWAR-8771)

  • Intel ixgbe network kernel driver is updated to 5.11.3 (SWAR-8771)

  • Prometheus Node Exporter is updated to 1.1.2 (SWAR-9130)

  • Debian operating system updates are included (SWAR-9027)

Fixed in 12.1

  • Retiring a Volume: The retirement of a volume can become stalled by a feed in a paused state. (SWAR-9096)

  • Elasticsearch Record Cleanup: Elasticsearch records for named streams sometimes persist even though they are deleted in Swarm using recursive delete of the containing bucket or domain. (SWAR-9095)

  • Delay in configure_elasticsearch_with_swarm_search.py Script: Lack of internet access results in a delay in the configure_elasticsearch_with_swarm_search.py script. This is addressed; it now installs the Prometheus plugin to allow monitoring via a Grafana dashboard if internet access is available, and skips the step if internet access is not available. (SWAR-9078)

  • Issue with Creating Untenanted Objects: An issue in Swarm 9.0 – 12.0 prevented the creation of untenanted objects when the DNS hostname of the Gateway matched a storage domain, even if the request explicitly said not to use a domain. As of 12.1, an empty domain query argument always forces untenanted operations. (SWAR-9074)

  • Progress Stalled After a Prolonged Outage: A feed (search, replication, S3 backup) stops making progress after a prolonged outage and a node reboot is required to resume progress. This is resolved in 12.1. (SWAR-9062)

  • Internal 404 Not Found and Other Errors: The SCSP error counter had erroneously been including internal errors unrelated to client activity. The SCSP error stat in SNMP, metrics, and the management API now includes only errors from client requests. (SWAR-9043)

  • Eliminated Unnecessary Feed Refreshes: Editing a search feed via Storage UI no longer triggers an unnecessary refresh of the feed. (UIS-1073 and SWAR-9024)

  • URL Encoding of Special Characters: Characters like "<" and ">" in Swarm redirects and location headers are now properly URL-encoded. (SWAR-9023)

  • Hanging Feed SEND Requests: A feed SEND request, such as those used by Remote Synchronous Write (RSW), can hang indefinitely instead of returning an error if the feed changed to a blocked state during the request. (SWAR-9019)

  • Simultaneous Domain and Bucket Creation via POST: Simultaneous domain and bucket creation via POST is now prevented. Only one of these requests responds with a 201 Created response. The other requests get either a 409 Conflict or 503 Service Unavailable response. (SWAR-3421)

  • Improved Behavior for Blocked Feeds Watch Item: A clearer error message draws attention to the issue for S3 backup and replication feeds that are blocked due to invalid X.509 ("SSL") certificates. (SWAR-8996)

  • Error When Attempting to Edit Search Feeds: An error popup within the UI mentioning "respondsToLists" can appear when editing and saving a search feed. (SWAR-9065)

  • Invalid or Expired Swarm Licenses: Swarm does not boot if the configured license is invalid or expired. (SWAR-9050)
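For integrated applications, the simultaneous domain and bucket creation fix listed above means a create request via POST can come back as 409 Conflict or 503 Service Unavailable when a competing request is handled first. The following is a minimal sketch of how a client might interpret those three status codes. The function name, the outcome labels, and the decisions to treat 409 as a lost race and 503 as retryable are assumptions for illustration; they are not behavior prescribed by Swarm.

```python
from enum import Enum


class CreateOutcome(Enum):
    """Hypothetical labels for the result of a domain or bucket create request."""
    CREATED = "created"          # 201: this request created the context
    LOST_RACE = "lost_race"      # 409: assumed to mean a competing request won
    RETRY_LATER = "retry_later"  # 503: assumed to be safe to retry after a pause
    UNEXPECTED = "unexpected"    # anything else should be inspected by the caller


def classify_create_response(status_code: int) -> CreateOutcome:
    """Map the HTTP status of a create request to a client-side outcome."""
    if status_code == 201:
        return CreateOutcome.CREATED
    if status_code == 409:
        return CreateOutcome.LOST_RACE
    if status_code == 503:
        return CreateOutcome.RETRY_LATER
    return CreateOutcome.UNEXPECTED


if __name__ == "__main__":
    for status in (201, 409, 503, 500):
        print(status, classify_create_response(status).value)
```

A caller that receives the lost-race outcome would typically confirm that the domain or bucket now exists before continuing, rather than treating the response as a failure.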

Upgrade Impacts

Required

Complete the migration to Swarm 11.3 and ES 6.8.6 before upgrading to Swarm 12 if on older Elasticsearch (5.6.12 or 2.3.3). See How to Upgrade Swarm, Upgrading from Unsupported Elasticsearch.

These items are changes to the product function that may require operational or development changes for integrated applications. Address the upgrade impacts for each version released since the one being upgraded from:

Impacts for 12.1

  • Change in the node.cfg File: The previously deprecated sysctl section of the node.cfg file is removed. Use kernel.sysctlFileUrl (introduced in Swarm 12.0) instead if it is necessary to set kernel runtime parameters. (SWAR-8968)

    Settings changes

    • Updated:

      • All sysctl.* settings are removed. (SWAR-8968)

      • support.reportPeriod default is changed to 21600 (6 hours). (SWAR-8424)

  • Swarm storage node metrics are deprecated and are replaced in the next major release by the graphs and reporting from Prometheus Node Exporter and Grafana. The storage administration UI is updated to allow for metrics to be turned off.

  • Differences in scsp.forceLegacyNonce configuration depending on the version being upgraded from (SWAR-9020):

    • Currently running a Swarm Storage version prior to 11.1, and upgrading to 11.1, 11.2, 11.3, 12.0 or 12.1:

      Before upgrading, set scsp.forceLegacyNonce=true in the node.cfg file. After the upgrade, when the cluster is fully up, set scsp.forceLegacyNonce=false using swarmctl and change the value to scsp.forceLegacyNonce=false in the node.cfg file.

    • Currently running Swarm Storage version 11.1, 11.2, 11.3, 12.0 or 12.1, and upgrading to another version from that list:

      Before upgrading, verify scsp.forceLegacyNonce=false is in the node.cfg file and verify using swarmctl that scsp.forceLegacyNonce=false in the cluster.
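The version-dependent scsp.forceLegacyNonce procedure above can be expressed as a small decision helper. This is a sketch only: the function names and the version-parsing approach are assumptions made for illustration, and the returned steps simply restate the instructions in these notes.

```python
import re
from typing import List, Tuple

# First Swarm Storage version for which the notes call for forceLegacyNonce=false.
FIRST_NEW_NONCE_VERSION = (11, 1)


def parse_version(text: str) -> Tuple[int, int]:
    """Return (major, minor) from a version string such as '11.0.3' or '12'."""
    numbers = [int(n) for n in re.findall(r"\d+", text)]
    if not numbers:
        raise ValueError(f"no version number found in {text!r}")
    major = numbers[0]
    minor = numbers[1] if len(numbers) > 1 else 0
    return (major, minor)


def legacy_nonce_steps(current_version: str) -> List[str]:
    """List the forceLegacyNonce steps for an upgrade from current_version."""
    if parse_version(current_version) < FIRST_NEW_NONCE_VERSION:
        return [
            "Before upgrading: set scsp.forceLegacyNonce=true in node.cfg.",
            "After upgrading, once the cluster is fully up: "
            "set scsp.forceLegacyNonce=false using swarmctl.",
            "Then change scsp.forceLegacyNonce=false in node.cfg.",
        ]
    return [
        "Before upgrading: verify scsp.forceLegacyNonce=false is in node.cfg.",
        "Before upgrading: verify with swarmctl that "
        "scsp.forceLegacyNonce=false in the cluster.",
    ]


if __name__ == "__main__":
    for version in ("10.2", "11.3.0"):
        print(version)
        for step in legacy_nonce_steps(version):
            print("  -", step)
```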

Use swarmctl to Check or Change Settings

Use 'swarmctl -C scsp.forceLegacyNonce' to check the value of scsp.forceLegacyNonce.

Use 'swarmctl -C scsp.forceLegacyNonce -V False' to set the value to false.

For more details, see https://support.cloud.caringo.com/tools/Tech-Support-Scripts-Bundle-swarmctl.pdf.
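When scripting these checks, the two swarmctl invocations above can be assembled as argument lists. The sketch below only builds the arguments and uses only the -C and -V options shown in these notes; the helper name is hypothetical, and any connection or credential options a given environment requires are not covered here and are deliberately left out.

```python
from typing import List, Optional


def swarmctl_args(setting: str, value: Optional[bool] = None) -> List[str]:
    """Build a swarmctl argument list to check (value=None) or set a setting."""
    args = ["swarmctl", "-C", setting]
    if value is not None:
        # str(False) gives "False", matching the capitalization used above.
        args += ["-V", str(value)]
    return args


if __name__ == "__main__":
    print(" ".join(swarmctl_args("scsp.forceLegacyNonce")))
    print(" ".join(swarmctl_args("scsp.forceLegacyNonce", False)))
```

Building a list rather than a single string means the arguments can be passed directly to Python's subprocess.run without shell quoting concerns.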

 

Cumulative Impacts

Address all upgrade impacts for each version released since the version being upgraded from.

Review the comprehensive Upgrade Impacts listed for the Swarm Storage 11.3 Release.

Watch Items and Known Issues

The following watch items are known:

  • A node fails to mount all disks in the node if it mounts an encrypted volume whose encryption key is missing from the configuration. (SWAR-8762)

  • S3 Backup feeds do not back up logical objects greater than 5 GB; those writes fail with a CRITICAL log message. (SWAR-8554)

  • The chassis shuts down but does not come back up when restarting a cluster of virtual machines that are UEFI-booted (versus legacy BIOS). (SWAR-8054)

These are standing operational limitations:

  • The Storage UI shows no NFS config if the Elasticsearch cluster is wiped. Contact DataCore Support for help repopulating the SwarmFS config information. (SWAR-8007)

  • If a bucket is deleted, any incomplete multipart uploads into that bucket leave their parts (unnamed streams) in the domain. To find and delete them, use the s3cmd utility (search the Support site for "s3cmd" for guidance). (SWAR-7690)

  • Removing subcluster assignments in the CSN UI creates invalid config parameters that prevent the unassigned nodes from booting. (SWAR-7675)

  • You may see false 404 Not Found and other SCSP errors during rolling reboot in versions 11.1 through 12.0.1. To mitigate this problem, set scsp.forceLegacyNonce=True in the cluster configuration. Remove this setting before upgrading to 12.1.0 or later. (SWAR-9020)

  • During a node reboot, such as a rolling reboot of the cluster, a newly booted node can temporarily return an empty result set for a listing query. (SWAR-9083)

  • S3 Backup restoration to the cluster may be blocked if the certificate is not located where Swarm expects it when using certificates with HAProxy. From 12.1, a clearer error message draws attention to the issue for S3 backup and replication feeds that are blocked due to invalid X.509 ("SSL") certificates. (SWAR-8996)

To upgrade Swarm 9 or higher, proceed now to How to Upgrade Swarm. Contact DataCore Support for guidance if migrating from Swarm 8.x or earlier.

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.