Swarm Storage 15.3 Release

Additional Changes

Changes include versions and fixes coming from testing and user feedback:

OSS Versions

See  for the complete listing of packages and versions for this release.

Fixed in 15.3

  • Improved Stability for Dense Clusters: Improved CPU utilization in dense clusters, which reduces spontaneous process restarts and other instabilities. (CUP-630)

  • Improved Handling of S3 Multi-Deletes: Fixed a bug where S3 multi-delete requests left ghost entries in Elasticsearch, which affected a number of customers using Veeam. (SWAR-9703)

  • EC Multipart and Versioned Segments: Eliminated the potential for missing EC segments in obsolete EC object versions. (SWAR-9770)

  • Finalizing State on Restart/Shutdown: Fixed an issue where a node did not finalize its state during restart or shutdown and instead required a manual power cycle. Manual power cycling should no longer be necessary. (SWAR-9767)

  • Network Disruption: Eliminated potential for data loss when a cluster had significant network disruption that continued for multiple HP cycles. (SWAR-9753)

  • EC Conversion: Fixed a rare cause of segment loss when EC conversions interleaved with volume recoveries. (SWAR-9751)

  • Elasticsearch Upgrade: Swarm Storage 15.3 requires upgrading Elasticsearch to 7.17.9. The rolling upgrade of the Elasticsearch cluster can be done either before or after upgrading to Storage 15.3. (SWAR-9731)

  • Retention Policies: Fixed an issue where updates to retention policies that failed, typically due to network-related issues, could result in the loss of the object. (SWAR-9842)

Watch Items and Known Issues

The following watch items are known:

  • Several settings stored in the cluster's persistent settings stream have new defaults in this version. Review these settings and adjust them as appropriate.

    • health.examDelay now defaults to 0.19. It is recommended not to use a value lower than the default.

    • health.fvrPushDelay has a new default of 0.7, which is recommended for most clusters.

    • power.savingMode has a new default of False, which is recommended for most clusters.

    • scsp.defaultSynchronousIndexWait has a new default of 60, which is recommended for most clusters. This is a non-persistent setting.

  • Setting network.host (https://www.elastic.co/guide/en/elasticsearch/reference/7.16/important-settings.html#network.host) to "_site_" in elasticsearch.yml might not choose the right IP to allow master election if the server is multi-homed. Edit elasticsearch.yml to enter a specific IP for the node; the configuration script preserves it. (SWAR-9350)
    If you run into this issue, fix it as follows:

    • Run systemctl stop elasticsearch on all ES nodes.

    • Remove all the contents of the path.data directory.

    • In elasticsearch.yml, set network.host: <IP of ES NIC in the Storage VLAN>

    • Run systemctl start elasticsearch.

  • Elasticsearch can fail to start and return a warning "unable to load JNA native support library", which is due to SELinux setting “noexec” on /tmp.
    For Elasticsearch 7.5.2, edit “/etc/elasticsearch/jvm.options” replacing the line "-Djava.io.tmpdir=${ES_TMPDIR}" with "-Djava.io.tmpdir=/var/log/elasticsearch". With Elasticsearch 7.17, uncomment the "Environment=ES_TMPDIR=/usr/share/elasticsearch/tmp" line in /etc/systemd/system/elasticsearch.service.d/override.conf and create that directory. (SWAR-9347)

  • Swarm versions 10.0 onward are vulnerable to kernel issues that manifest on some Intel CPUs. Symptoms include lowered performance, long mount times, and cluster instability. Swarm versions 14.1 and later provide a workaround for this issue; see https://caringo.atlassian.net/wiki/spaces/KB/pages/2973204604. (SWAR-9055)

  • Customers who perform paginated listing queries (using sort and marker) must choose a unique set of fields for the query; otherwise, the listing may not return the complete results. (SWAR-9630)

  • Writes of objects into versioned buckets suffer a small (constant) performance penalty. This only applies to the 15.3.0 release. (SWAR-9794)

  • The search configuration utility was upgraded to caringo-elasticsearch-search-7.1.0-1.noarch.rpm and Elasticsearch 7.17.9 is now the default version installed. If you are running Elasticsearch 7.5.2, do not install the 7.17.9 rpm directly. Instead, run the latest configuration script configure_elasticsearch_with_swarm_search.py. The configuration script upgrades /etc/sysconfig/elasticsearch, so that the default Elastic-bundled JDK is used. (SWAR-9159)

Note

Elasticsearch cannot be downgraded back to 7.5.2.

  • The 15 series releases can exhibit a known issue when attempting to hot plug drives into a Swarm node that uses the Broadcom/LSI HBA driver based on the kernel version in the release. Customers may experience situations where the hot plug of a known good drive fails, requiring a node restart to allow the drive to be recognized. (SWAR-9873)
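
The changed defaults in the first watch item above lend themselves to a quick scripted review. The following is a minimal sketch only: the review_settings function and the current dictionary are hypothetical, and the sketch assumes the site has already collected its overridden setting values by some other means (it does not query a cluster). Only the setting names and default values come from the list above.

```python
# New 15.3 defaults, copied from the watch item above.
NEW_DEFAULTS = {
    "health.examDelay": 0.19,
    "health.fvrPushDelay": 0.7,
    "power.savingMode": False,
    "scsp.defaultSynchronousIndexWait": 60,
}


def review_settings(current):
    """Return findings for settings that differ from the 15.3 defaults.

    `current` is a hypothetical dict of setting name -> value that the
    site has overridden; settings missing from it simply take the new
    default and are not reported.
    """
    findings = []
    for name, default in NEW_DEFAULTS.items():
        if name not in current:
            continue  # not overridden, so the new default applies
        value = current[name]
        if name == "health.examDelay" and value < default:
            # The release notes recommend not going below the default.
            findings.append(
                f"{name}={value} is below the recommended minimum {default}"
            )
        elif value != default:
            findings.append(
                f"{name}={value} differs from the 15.3 default {default}"
            )
    return findings


if __name__ == "__main__":
    # Illustrative values only; not taken from a real cluster.
    sample = {"health.examDelay": 0.1, "power.savingMode": True}
    for finding in review_settings(sample):
        print(finding)
```

A finding is not necessarily an error; it only flags a value worth reviewing against the recommendations above.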

These are standing operational limitations:

  • The Storage UI shows no NFS config if the Elasticsearch cluster is wiped. Contact DataCore Support for help in repopulating the SwarmFS config information. (SWAR-8007)

  • Any incomplete multipart upload into a bucket leaves the parts (unnamed streams) in the domain if the bucket is deleted. To find and delete those parts, use the s3cmd utility (search the Support site for "s3cmd" guidance). (SWAR-7690)

  • The chassis shuts down but does not come back up when restarting a cluster of virtual machines that are UEFI-booted (versus legacy BIOS). (SWAR-8054)

  • Removing subcluster assignments in the CSN UI creates invalid config parameters that prevent the unassigned nodes from booting. (SWAR-7675)

  • After a node reboots, some feed statistics that are not persisted across boots may display incorrectly. The statistics eventually correct themselves; there is currently no workaround. (SWAR-9720)

To upgrade Swarm 9 or higher, proceed to . For migration from Swarm 8.x or earlier, contact DataCore Support for guidance.

Instructions for rpm v15.2 and above on CSN

Follow these steps if using rpm version 15.2 or above on the CSN:

  1. Edit the /etc/caringo/netboot/netboot.cfg file on the CSN.

  2. Verify that the kernelOptions parameter includes the new maximum size for the ramdisk.

    kernelOptions = castor_net=active-backup: ramdisk_size=190000

    Use a space separator between “active-backup:” and ramdisk_size=190000, as shown in the line above.

  3. Restart netboot.
    service netboot restart
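
The edit in step 2 can also be sketched as a small script. This is an illustration under stated assumptions, not a supported tool: ensure_ramdisk_size is a hypothetical helper, it assumes netboot.cfg holds simple "key = value" lines, and it works on the file's text only, so reading and writing /etc/caringo/netboot/netboot.cfg and restarting netboot remain manual steps.

```python
REQUIRED_RAMDISK_SIZE = 190000  # value from step 2 above


def ensure_ramdisk_size(cfg_text, size=REQUIRED_RAMDISK_SIZE):
    """Return cfg_text with ramdisk_size=<size> on the kernelOptions line.

    Assumes simple "key = value" lines; all other lines pass through
    unchanged. If no kernelOptions line exists, nothing is added.
    """
    out = []
    for line in cfg_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "kernelOptions":
            # Drop any existing ramdisk_size option, then append the
            # required one, keeping options separated by single spaces.
            options = [
                opt for opt in value.split()
                if not opt.startswith("ramdisk_size=")
            ]
            options.append(f"ramdisk_size={size}")
            line = f"kernelOptions = {' '.join(options)}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    before = "kernelOptions = castor_net=active-backup:\n"
    print(ensure_ramdisk_size(before), end="")
```

Splitting the options on whitespace and rejoining them with single spaces keeps the space separator that step 2 calls for.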

 

© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.