Swarm Storage 15.2 Release

Additional Changes

Changes include version updates and fixes arising from testing and user feedback:

OSS Versions — See Third-Party Components for Storage 15.2 for the complete listing of packages and versions for this release.
Fixed in 15.2
- Restarting behavior — Improved the behavior of restarting internal processes on failure, reducing the need for manual node restarts. (SWAR-9670)
- Cluster performance — Reduced disk utilization when feeds are in use, possibly improving cluster performance. (SWAR-9659)
- Skylake perf vulnerability detection — Improved the detection of CPUs that might be impacted by a kernel-level performance vulnerability. (SWAR-9485)
- Multi-version deletion — Fixed an issue where rapid deletes of multiple versions of the same object sometimes leave an Elasticsearch record for the removed object. (SWAR-9625)
- Performance — Improved the GET performance; it is 10% better than the previous release (15.1.0). (SWAR-9660)

The following watch items are known:

Setting network.host (https://www.elastic.co/guide/en/elasticsearch/reference/7.16/important-settings.html#network.host) to "_site_" in elasticsearch.yml might not select the correct IP for master election if the server is multi-homed. Enter a specific IP for the node in elasticsearch.yml; the configuration script preserves it. (SWAR-9350)
If you run into this issue, the fix is to:
- Run systemctl stop elasticsearch on all ES nodes.
- Remove all the contents of the path.data directory.
- Set network.host: <IP of ES NIC in the Storage VLAN> in elasticsearch.yml.
- Run systemctl start elasticsearch.
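The recovery steps above could be scripted roughly as follows. This is a minimal sketch, not a supported procedure: the helper names set_network_host and prepare_es_node, the paths /etc/elasticsearch/elasticsearch.yml and /var/lib/elasticsearch (RPM defaults; use the actual path.data value), and any IP passed in are assumptions. Removing the contents of path.data is destructive, so check the values before running it.

```shell
#!/bin/sh
# Hypothetical helper: set network.host in an elasticsearch.yml file.
#   $1 = path to elasticsearch.yml, $2 = IP of the ES NIC in the Storage VLAN
set_network_host() {
    yml=$1
    ip=$2
    tmp=$(mktemp) || return 1
    if grep -q '^network\.host:' "$yml"; then
        # Replace the existing setting (for example "_site_") with the IP.
        sed "s|^network\.host:.*|network.host: ${ip}|" "$yml" > "$tmp"
    else
        # No uncommented setting yet: keep the file and append one.
        cat "$yml" > "$tmp"
        printf 'network.host: %s\n' "$ip" >> "$tmp"
    fi
    # Write back in place so ownership and permissions are kept.
    cat "$tmp" > "$yml"
    rm -f "$tmp"
}

# Hypothetical per-node wrapper; it is defined here but not called.
#   $1 = IP for this node, $2 = path.data directory (assumed default below)
prepare_es_node() {
    node_ip=$1
    data_dir=${2:-/var/lib/elasticsearch}
    systemctl stop elasticsearch || return 1
    # Destructive: removes everything under path.data.
    find "$data_dir" -mindepth 1 -delete || return 1
    set_network_host /etc/elasticsearch/elasticsearch.yml "$node_ip"
}
```

Under these assumptions, prepare_es_node would be run on every ES node first, each with its own IP, and systemctl start elasticsearch would follow only after all nodes are stopped and updated.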
If an Elasticsearch instance fails to start and its logs show JNA warnings, verify that the “java.io.tmpdir” configured in “jvm.options” is writable by Elasticsearch. Change “java.io.tmpdir” to /var/log/elasticsearch, as your security preferences allow. (SWAR-9347)
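One way to run this check from a shell is sketched below. The helper names get_jvm_tmpdir and tmpdir_writable_by, the jvm.options location /etc/elasticsearch/jvm.options, and the elasticsearch service account are assumptions based on a default RPM install. A value such as ${ES_TMPDIR} is a placeholder that Elasticsearch resolves at startup, so the check is only meaningful when the option holds a literal path.

```shell
#!/bin/sh
# Hypothetical helper: print the java.io.tmpdir value set in a jvm.options file.
#   $1 = path to jvm.options
get_jvm_tmpdir() {
    # Keep only uncommented -Djava.io.tmpdir=... lines and report the last one.
    sed -n 's/^-Djava\.io\.tmpdir=//p' "$1" | tail -n 1
}

# Hypothetical check: succeed if the given user can write to the directory.
#   $1 = directory, $2 = user (assumed service account: elasticsearch)
tmpdir_writable_by() {
    sudo -u "$2" sh -c 'test -d "$1" && test -w "$1"' sh "$1"
}

# Example use (paths and user are assumptions):
#   dir=$(get_jvm_tmpdir /etc/elasticsearch/jvm.options)
#   tmpdir_writable_by "$dir" elasticsearch || echo "not writable: $dir"
```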
Swarm versions 10.0 onward are vulnerable to kernel issues manifested on some Intel CPUs. Symptoms include lowered performance, long mount times, and cluster instability. Swarm versions 14.1 and later provide a workaround for this issue; see https://caringo.atlassian.net/wiki/spaces/KB/pages/2973204604. (SWAR-9055)
Customers who perform paginated listing queries (using sort and marker) must choose a set of fields that is unique for each result in order to receive complete results. (SWAR-9630)

These are standing operational limitations:

The Storage UI shows no NFS config if the Elasticsearch cluster is wiped. Contact DataCore Support for help in repopulating the SwarmFS config information. (SWAR-8007)
If a bucket is deleted, any incomplete multipart upload into that bucket leaves its parts (unnamed streams) in the domain. To find and delete those parts, use the s3cmd utility (search the Support site for "s3cmd" guidance). (SWAR-7690)
When a cluster of virtual machines that are UEFI-booted (versus legacy BIOS) is restarted, the chassis shuts down but does not come back up. (SWAR-8054)
Removing subcluster assignments in the CSN UI creates invalid config parameters that prevent the unassigned nodes from booting. (SWAR-7675)

To upgrade from Swarm 9 or higher, proceed to How to Upgrade Swarm. For migration from Swarm 8.x or earlier, contact DataCore Support for guidance.

Instructions for rpm v15.2 and above on CSN

Follow the steps below if using rpm version 15.2 or above on the CSN:

Edit the /etc/caringo/netboot/netboot.cfg file on the CSN.
Verify that the kernelOptions parameter includes the new maximum size for the ramdisk:
```
kernelOptions = castor_net=active-backup: ramdisk_size=190000
```
Use a space as the separator between “active-backup:” and ramdisk_size=190000, as shown in the line above.
Restart netboot:
service netboot restart
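
The kernelOptions setting can be checked before the restart with a short shell test such as the one below. It is a sketch only: the function name has_ramdisk_size is hypothetical, and the pattern assumes kernelOptions sits on a single line, as in the example above.

```shell
#!/bin/sh
# Hypothetical check: succeed when the kernelOptions line in a netboot.cfg
# file ($1) contains ramdisk_size=190000 as its own space-separated option.
has_ramdisk_size() {
    grep -Eq '^[[:space:]]*kernelOptions[[:space:]]*=[[:space:]]*(.*[[:space:]])?ramdisk_size=190000([[:space:]]|$)' "$1"
}

# Example use on the CSN:
#   has_ramdisk_size /etc/caringo/netboot/netboot.cfg && service netboot restart
```

The check fails when the space after “active-backup:” is missing, because ramdisk_size=190000 is then no longer a separate option.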