Swarm Storage 11.1 Release
New Features
Grafana Dashboards for Swarm Monitoring: To offer sophisticated visualization of the Prometheus Node Exporter and related Swarm data, DataCore publishes public Grafana dashboards for monitoring Swarm implementations. To see the latest dashboards for Swarm products and features, search the dashboards for Caringo: https://grafana.com/grafana/dashboards?search=caringo. See Prometheus Node Exporter and Grafana.
Customized dashboards are available for Swarm System Monitoring, with separate dashboards specific to versions of Swarm Storage, starting with 10.2. The detailed dashboard covers cluster health, capacity, indexing, licensing, temperature, and network and CPU loads, as well as cluster-wide operations.
The Prometheus Node Exporter produces a totaled version of each of the SCSP-related statistics (appending _total to the original name), to capture counts in addition to aggregate rates. These totaled statistics for Swarm HTTP operations and responses are incorporated into the Grafana dashboard for Swarm 11.1. (SWAR-8710)
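To illustrate why totaled counters are useful alongside rates, the sketch below derives a per-second rate from two samples of a monotonically increasing total. It is a minimal illustration, not the dashboard's actual query: the function name and sample values are hypothetical, and the reset handling (treating the post-restart total as the increase) is an assumption modeled on how counter-based monitoring typically behaves.

```python
def rate_from_totals(prev_total: float, curr_total: float, interval_s: float) -> float:
    """Per-second rate between two samples of an increasing counter.

    prev_total and curr_total are two readings of a "_total" statistic,
    taken interval_s seconds apart.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if curr_total < prev_total:
        # The counter went backwards, so it was probably reset (for example,
        # by a node restart). Assumption: count the new total as the increase.
        return curr_total / interval_s
    return (curr_total - prev_total) / interval_s


# Hypothetical samples of one totaled statistic, taken 60 seconds apart.
print(rate_from_totals(1000, 1600, 60))
```

Because the total is retained, the same two samples also give the exact operation count for the interval (600 here), which a rate-only statistic cannot provide.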
Elasticsearch 6: Swarm supports and ships with Elasticsearch 6, a version that allows upgrades-in-place (without reindexing) for several releases going forward. Both ES2 and ES5 will be deprecated in the next release. To migrate from either ES2 or ES5, create a new ES6 cluster, add a new search feed (to reindex metadata), and switch over to it when the reindexing is complete. See Migrating from Older Elasticsearch.
Upgrade to Python 3: All Swarm Storage usage of Python 2 is uniformly upgraded to Python 3, which brings a small performance boost, up to 20% improvement for high loads. (SWAR-8143)
Modernization: Extensive work has modernized the Linux kernel to Debian 10 and its drivers and components, which allowed for comprehensive updates across Swarm's third-party tools and dependencies. See Third-Party Components for 11.1 for the complete listing. (SWAR-8664)
Administration Improvements - This release includes several changes to make it easier to monitor and manage Swarm:
Swarm has improved handling of slashes in object naming to prevent unintended naming and renaming errors. Leading slashes are always removed, and trailing spaces are removed from bucket names. Trailing slashes in domain names cause 404 errors, but trailing slashes are valid for named objects, so they are retained. (SWAR-8706)
Multipart writes are long-running operations that provide an initial 202 Accepted response and a later 201 Created response, on completion. For S3 compatibility, the initial response now includes a Completion-Etag with the value of the expected ETag. If there is an error, there is no new object, and the expected ETag provided is not valid. (SWAR-8694)
For a multipart object, to copy from a start range to the end of the object, do so by omitting the range end. This avoids the risk of the end value extending beyond the size of the object being copied, which results in a 416 Range Not Satisfiable response. (SWAR-8675)
Logging of disk diagnostics (such as dmesg and SMART data) now covers volume retires that are due to I/O device errors, in addition to volume failures. (SWAR-8665)
Swarm 11.1 has improved volume health monitoring and alerting to surface overly long I/O request times that may be an indication of a volume nearing its end of life. (SWAR-8585)
When returning a list of drives via the management API (/api/storage/chassis/*/drives), Swarm now returns both the drive name (such as /dev/sdd) and the volume's UUID. (SWAR-8637)
Replication feed handling now generates more accurate state reporting and helpful status descriptions, to support diagnosis of blocked feeds. (SWAR-8660)
All Swarm Management API endpoints that required specifying the cluster name now accept "_self" to refer to the local cluster, which eases formation of the call. (SWAR-8636)
Error messaging now clarifies when an attempt to update a Swarm setting via the API has failed because the setting is read-only. (SWAR-8443)
Swarm no longer ignores erroneous use of the "format" query argument on a non-listing request (a request other than GET or HEAD). Swarm now returns a 400 Bad Request error. (SWAR-8598)
The retired setting cluster.settingsUuid is now ignored by Swarm, which guarantees obsolete values do not prevent Swarm from booting. (SWAR-8535)
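The open-ended range described above for multipart copies (SWAR-8675) can be sketched as a small helper that builds a byte-range specifier and omits the end when none is given. This is a hedged illustration only: the function name is hypothetical, the "bytes=start-end" syntax is the standard HTTP byte-range form, and how the range is passed to Swarm for a copy (header or query argument) is not shown here.

```python
from typing import Optional


def byte_range(start: int, end: Optional[int] = None) -> str:
    """Build an HTTP-style byte-range specifier.

    With end omitted, the range runs from start to the end of the object,
    so it cannot overshoot the object's size.
    """
    if start < 0:
        raise ValueError("start must be non-negative")
    if end is None:
        return f"bytes={start}-"
    if end < start:
        raise ValueError("end must not be less than start")
    return f"bytes={start}-{end}"


print(byte_range(1024))      # open-ended: from byte 1024 to the end
print(byte_range(0, 99))     # explicit end: first 100 bytes
```

Preferring the open-ended form means the caller does not need to know the exact object size in order to avoid a 416 Range Not Satisfiable response.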
Additional Changes
These items are other changes and improvements including those that come from testing and user feedback.
OSS Versions
See Third-Party Components for 11.1 for the complete listing of packages and versions.
The Linux kernel is upgraded to 4.19.84. (SWAR-8664)
Linux firmware is upgraded to 1.183.2. (SWAR-8664)
Fixed in 11.1.0
Persisted settings, including security.administrators, may not update properly when the persisted settings object was read at startup. This issue mostly affected chassis with encrypted volumes or more than 6 volumes. (SWAR-8800)
With Elasticsearch 5, listing a bucket or domain with fields=all and format=json receives a response with invalid JSON. (SWAR-8781)
Premature closes of EC object reads sometimes cause abnormal memory usage and critical errors. (SWAR-8709)
Read failures (500: ZeroDivisionError) can occur with small range reads near the end of EC objects, for certain encodings. (SWAR-8661)
When versions 10.x-11.0 were used with ES 5.6, deprecation warnings caused logs to consume excessive disk space. (SWAR-8632)
Unnamed objects can appear in listings even after they are deleted. (SWAR-8623)
Under some conditions, Swarm may start without mounting some of its volumes. (SWAR-8597)
Upgrade Impacts
Use the supported versions of Swarm components for the target version of Elasticsearch:
Elasticsearch | Swarm Storage | Gateway | SwarmFS | Notes
6.8.6 | 11.1 | 6.3 | 2.4 | Recommended configuration.
5.6.12 | 10.0 - 11.1 | 6.0 - 6.3 | 2.4 | Plan to migrate to Elasticsearch 6.
2.3.3 | 9.6 - 11.1 | 5.4 | 2.1 |
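As a planning aid, the Storage column of the table above can be encoded as data and queried for a given Swarm Storage version. This is a minimal sketch: the names SUPPORT_MATRIX and elasticsearch_options are hypothetical, only the Elasticsearch and Swarm Storage columns are modeled, and the Gateway and SwarmFS versions still need to be checked against the table.

```python
from typing import List, Tuple

# (Elasticsearch version, lowest Storage version, highest Storage version),
# transcribed from the table above.
SUPPORT_MATRIX: List[Tuple[str, Tuple[int, int], Tuple[int, int]]] = [
    ("6.8.6", (11, 1), (11, 1)),
    ("5.6.12", (10, 0), (11, 1)),
    ("2.3.3", (9, 6), (11, 1)),
]


def elasticsearch_options(storage_version: str) -> List[str]:
    """Return the Elasticsearch versions listed for a Storage version."""
    parts = storage_version.split(".")[:2]       # compare on major.minor only
    version = (int(parts[0]), int(parts[1]))
    return [es for es, low, high in SUPPORT_MATRIX if low <= version <= high]


print(elasticsearch_options("11.1"))
print(elasticsearch_options("10.2"))
```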
These items are changes to the product function that may require operational or development changes for integrated applications. Address the upgrade impacts for each of the versions since the one currently running.
Impacts for 11.1
Upgrading Elasticsearch: Use Elasticsearch 5.6.12/2.3.3 with Storage 11.1 if moving to ES 6 immediately is not possible, but start the migration now (see Migrating from Older Elasticsearch). Support for ES 5.6.12/2.3.3 ends in a future release, and testing for 2.3.3 with Swarm 11 is discontinued. Important: Always upgrade Swarm Search and Metrics at the same time ES is upgraded. Do not run an ES 5 Search or Metrics Curator against ES 6.
Swarm Search and Metrics: This release includes new versions of Swarm Search and Metrics RPMs. Both require Python 3 to be installed on the ES servers they run on.
For Swarm Metrics on RHEL/CentOS 7.7, first install this dependency:
yum install epel-release
Python 3: Install Python 3 if it is not automatically installed with RHEL/CentOS 7.
Propagate Deletes Removed: For Replication Feeds, the Propagate Deletes option is removed from the legacy Admin Console and the Management API (propagateDeletes, nodeletes fields). (SWAR-8609, SWAR-8615)
Swarm Configuration: Run the Storage Settings Checker before upgrading to this version, to identify configuration issues.
The Storage Settings Checker now requires Python 3 to be installed. (SWAR-8742)
crier.deadVolumeWall has been unpublished for reimplementation. (SWAR-8640)
S3 Backup Restore: The S3 Backup Restore Tool has been migrated to Python 3.6. If the tool is installed, uninstall it and install the new version. (SWAR-8703)
Upgrade Process: During the upgrade to 11.1, it may not be possible to monitor the cluster via the Swarm UI. Workaround: Use the legacy Admin Console (port 90) during upgrade. (SWAR-8716)
Differences in scsp.forceLegacyNonce configuration depending on the version being upgraded from (SWAR-9020):
If currently running a Swarm Storage version prior to 11.1 and upgrading to 11.1, 11.2, 11.3, 12.0 or 12.1:
Before upgrading, set scsp.forceLegacyNonce=true in the node.cfg file. After the upgrade, when the cluster is fully up, update scsp.forceLegacyNonce=false using swarmctl and change scsp.forceLegacyNonce=false in the node.cfg file.
If currently running a Swarm Storage version 11.1, 11.2, 11.3, 12.0 or 12.1 and upgrading to another version from that list:
Before upgrading, verify scsp.forceLegacyNonce=false is in the node.cfg file and verify using swarmctl that scsp.forceLegacyNonce=false in the cluster.
Use swarmctl to Check or Change Settings
Use 'swarmctl -C scsp.forceLegacyNonce' to check the value of scsp.forceLegacyNonce.
Use 'swarmctl -C scsp.forceLegacyNonce -V False' to set the value to false.
For more details, see https://support.cloud.caringo.com/tools/Tech-Support-Scripts-Bundle-swarmctl.pdf.
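The scsp.forceLegacyNonce steps above can be sketched as a few helpers for a pre-upgrade check. This is an illustration under stated assumptions, not a supported tool: all function names are hypothetical, node.cfg is assumed to hold simple "name = value" lines with "#" comments, and only the swarmctl options shown above (-C and -V) are used. Any connection options that swarmctl needs in a given environment are not shown.

```python
from typing import List, Optional

SETTING = "scsp.forceLegacyNonce"


def read_setting(cfg_text: str, name: str = SETTING) -> Optional[str]:
    """Return the last value assigned to name in node.cfg-style text, or None."""
    value = None
    for line in cfg_text.splitlines():
        line = line.split("#", 1)[0].strip()   # assumption: '#' starts a comment
        if "=" not in line:
            continue
        key, _, val = line.partition("=")
        if key.strip() == name:
            value = val.strip()
    return value


def pre_upgrade_value(running_version: str) -> str:
    """Value the notes above call for in node.cfg before upgrading.

    Assumes the target version is one of 11.1, 11.2, 11.3, 12.0 or 12.1.
    """
    parts = running_version.split(".")[:2]
    version = (int(parts[0]), int(parts[1]))
    if version > (12, 1):
        raise ValueError("running version is not covered by these notes")
    return "true" if version < (11, 1) else "false"


def swarmctl_args(value: Optional[str] = None) -> List[str]:
    """Argument list to check the setting (no value) or change it (with value)."""
    args = ["swarmctl", "-C", SETTING]
    if value is not None:
        args += ["-V", value]
    return args


cfg = "cluster.name = demo\nscsp.forceLegacyNonce = true\n"
print(read_setting(cfg), pre_upgrade_value("10.2"))
print(" ".join(swarmctl_args("False")))
```

The argument lists are built rather than executed so that the check can be reviewed before anything is run against a live cluster.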
Watch Items and Known Issues
The following operational limitations and watch items exist in this release.
Infrequent WARNING messages, "Node/Volume entry not published due to lock contention (...); action is retried," may appear in logs. Unless they are frequent, they may be ignored. (SWAR-8802)
If a node mounts an encrypted volume that is missing the encryption key in the configuration, the node fails to mount all disks in the node. (SWAR-8762)
S3 Backup feeds do not back up logical objects greater than 5 GB. (SWAR-8554)
If downgrading from Swarm 11.0, CRITICAL errors may appear on the feeds. To stop the errors, edit the existing feed definition names via the Swarm UI or legacy Admin Console. (SWAR-8543)
When restarting a cluster of virtual machines that are UEFI-booted (versus legacy BIOS), the chassis shut down but do not come back up. (SWAR-8054)
If the Elasticsearch cluster is wiped, the Storage UI shows no NFS config. Contact DataCore Support for help repopulating the SwarmFS config information. (SWAR-8007)
If a bucket is deleted, any incomplete multipart upload into that bucket leaves its parts (unnamed streams) in the domain. To find and delete them, use the s3cmd utility (search the Support site for "s3cmd" for guidance). (SWAR-7690)
Logs showed the error "FEEDS WARNING: calcFeedInfo(etag=xxx) cannot find domain xxx, which is needed for a domains-specific replication feed". The root cause is fixed; if receiving such warnings, contact DataCore Support so the issue can be resolved. (SWAR-7556)
Note these installation issues:
The elasticsearch-curator package may show an error during an upgrade, which is a known curator issue. Workaround: Reinstall the curator (SWAR-7439):
yum reinstall elasticsearch-curator
Do not install the Swarm Search RPM before installing Java. If Gateway startup fails with "Caringo script plugin is missing from indexer nodes", uninstall and reinstall the Swarm Search RPM. (SWAR-7688)
Upgrading Swarm
Proceed to How to Upgrade Swarm to upgrade Swarm 9 or higher.
Important
Contact DataCore Support for guidance if needing to upgrade from Swarm 8.x or earlier.
© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.