How Swarm settings work: precedence and dynamic changes

Because Swarm tries to cover as many use cases and scenarios as possible, the way settings are set and later changed is fairly complex. This article explains how the settings mechanism works.

Before getting into the specifics of Swarm settings, a clear distinction must be made between "Cluster Level Parameters" and "Node Level Parameters".

Swarm Cluster Level Parameters are attributes that affect the cluster in its entirety; once set, they propagate from node to node until they reach the entire cluster. From the SNMP perspective, the CARINGO-CASTOR-MIB.txt file has a MIB group called "clusterConfig" that contains all cluster-level settings (each is either read-only or read/write). The API is somewhat more complete and shows Cluster Level settings at the following endpoint: "http://<Swarm_Node_IP>:91/api/storage/clusters/<Swarm_Cluster_Name>/settings" (where <Swarm_Node_IP> and <Swarm_Cluster_Name> need to be replaced with the actual Swarm Node IP address and the actual Swarm Cluster Name).

Swarm Node Level Parameters are intended to be node-specific settings: they affect only the node where they have been set and do not automatically propagate to the rest of the nodes in the cluster. From the SNMP perspective, the CARINGO-CASTOR-MIB.txt file has a MIB group called "config" that contains all node-level settings (each is either read-only or read/write). The API shows Node Level settings at the following endpoint: "http://<Swarm_Node_IP>:91/api/storage/nodes/<Swarm_Node_Unique_ID>/settings" (where <Swarm_Node_IP> and <Swarm_Node_Unique_ID> need to be replaced with the actual Swarm Node IP address and the actual Swarm Node Unique ID). The Swarm Node Unique ID can be retrieved from "http://<Swarm_Node_IP>:91/api/storage/nodes".
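
As a minimal sketch, the two endpoint patterns described above can be assembled with a few Python helpers. The function names, the sample cluster name and the sample node ID are illustrative assumptions and are not part of any Swarm tooling; only the URL layout and port 91 come from the text above.

```python
from urllib.parse import quote

API_PORT = 91  # port used by the endpoints quoted in this article


def cluster_settings_url(node_ip: str, cluster_name: str) -> str:
    """Build the Cluster Level settings endpoint for the given cluster."""
    return (f"http://{node_ip}:{API_PORT}/api/storage/clusters/"
            f"{quote(cluster_name, safe='')}/settings")


def nodes_url(node_ip: str) -> str:
    """Build the endpoint from which Swarm Node Unique IDs can be retrieved."""
    return f"http://{node_ip}:{API_PORT}/api/storage/nodes"


def node_settings_url(node_ip: str, node_id: str) -> str:
    """Build the Node Level settings endpoint for the given node ID."""
    return (f"http://{node_ip}:{API_PORT}/api/storage/nodes/"
            f"{quote(node_id, safe='')}/settings")


if __name__ == "__main__":
    # "example-cluster" and "example-node-id" are placeholder values.
    print(cluster_settings_url("192.168.209.84", "example-cluster"))
    print(nodes_url("192.168.209.84"))
    print(node_settings_url("192.168.209.84", "example-node-id"))
```

The helpers only build strings; fetching the URLs (for example with a browser or an HTTP client) and supplying credentials is left out, since the response format is not described here.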

The baseline configuration file is node.cfg/cluster.cfg (depending on whether you are using a CSN). In this configuration file, one sets the parameters that are necessary for the cluster to boot for the first time (such as cluster.name, timeSource, vols, group, etc.).

Once the cluster has booted for the first time, any dynamic change to a cluster level parameter (be it through the Swarm UI, Legacy UI, an SNMP command or an API command) triggers the creation of the Persistent Settings Stream (PSS). Shutting down or rebooting a node or the entire cluster also triggers the creation of the PSS (if nothing has created it beforehand). The PSS is stored in Swarm and, as the name suggests, it persists across reboots.

The purpose of the PSS is to provide a mechanism to configure certain settings (cluster level settings) on all nodes in the cluster at once and to provide those same settings to any new nodes entering the cluster. As Swarm code has evolved, more and more settings that were previously only in the node.cfg/cluster.cfg file have also been placed in the PSS, which allows settings to be changed dynamically without requiring a reboot. Any setting that exists in the PSS overrides the corresponding setting in the node.cfg/cluster.cfg file as soon as the node finishes initializing at boot time. However, by design, not all Swarm cluster level settings make their way into the PSS, so the node.cfg/cluster.cfg file is still relevant.
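
The precedence rule above can be modelled in a few lines of Python. This is only a conceptual sketch and not Swarm's actual implementation: the function name and the sample values are assumptions made for illustration, and the settings are reduced to plain dictionaries.

```python
def effective_settings(config_file: dict, pss: dict) -> dict:
    """Model the precedence rule: a value in the PSS wins over the same
    key in node.cfg/cluster.cfg; keys absent from the PSS keep the
    value from the configuration file."""
    merged = dict(config_file)  # start from the baseline configuration file
    merged.update(pss)          # PSS entries override matching keys
    return merged


if __name__ == "__main__":
    # Sample values are placeholders chosen for this illustration.
    node_cfg = {"cluster.name": "example-cluster", "policy.eCEncoding": "5:2"}
    pss = {"policy.eCEncoding": "4:2"}
    print(effective_settings(node_cfg, pss))
```

In this model "policy.eCEncoding" resolves to the PSS value "4:2", while "cluster.name", which is not in the PSS, still comes from the configuration file.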

Dynamic changes (changes made while Swarm is running, which do not require a reboot) can be performed via SNMP commands or via API calls (the Swarm UI also makes API calls in the background). The list of parameters that can be changed through SNMP can be seen by opening https://support.cloud.datacore.com/tools/CARINGO-CASTOR-MIB.txt; only settings that are read/write in the MIB file can be changed.
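
To find the read/write settings in a MIB file without reading it by hand, a rough filter such as the following may help. It is a sketch that assumes the file uses standard OBJECT-TYPE definitions with an ACCESS (SMIv1) or MAX-ACCESS (SMIv2) clause; it is not a full MIB parser, and the sample fragment and object names below are made up for the illustration rather than copied from CARINGO-CASTOR-MIB.txt.

```python
import re

# Capture "<name> OBJECT-TYPE ... ACCESS <level>" (or MAX-ACCESS in SMIv2).
OBJECT_RE = re.compile(
    r"^\s*(\w+)\s+OBJECT-TYPE\b.*?\b(?:MAX-)?ACCESS\s+([\w-]+)",
    re.MULTILINE | re.DOTALL,
)


def writable_objects(mib_text: str) -> list:
    """Return the names of objects whose access level is read-write."""
    return [name for name, access in OBJECT_RE.findall(mib_text)
            if access == "read-write"]


# Hypothetical fragment used only to exercise the filter.
SAMPLE_MIB = """
exampleWritable OBJECT-TYPE
    SYNTAX INTEGER
    ACCESS read-write
    STATUS mandatory
    DESCRIPTION "Hypothetical writable object."
    ::= { example 1 }

exampleReadOnly OBJECT-TYPE
    SYNTAX INTEGER
    ACCESS read-only
    STATUS mandatory
    DESCRIPTION "Hypothetical read-only object."
    ::= { example 2 }
"""

if __name__ == "__main__":
    print(writable_objects(SAMPLE_MIB))
```

To run it against the real file, read the downloaded MIB into a string and pass it to writable_objects(); the result should be treated as a starting point and checked against the MIB itself.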


Dynamic Cluster Level Parameter changes that are performed via SNMP commands are propagated from node to node to the entire cluster and also make their way automatically into the PSS. To change an SNMP parameter for the entire cluster (in this case the Erasure Coding schema, set to "4:2"), it is sufficient to run the following from the support tools directory:

./snmp-castor-tool.sh -C policyECEncoding -V "4:2"

The "snmp-castor-tool.sh" script is part of Caringo's support bundle (swarm-support-tools.tgz). Please check the script's help for more options.

A Node Level Parameter's value can also be changed via SNMP commands. In the example below, the log level is changed to "10" only for node 192.168.209.85. Rebooting the chassis to which 192.168.209.85 belongs resets the log level for that node to its default value (the change does not persist across reboots).

./snmp-castor-tool.sh -L 10 -V 192.168.209.85


For API calls, as mentioned above, there are two API endpoints for changing settings: a Cluster Settings endpoint and a Node Settings endpoint. Cluster settings are automatically propagated to the PSS and all nodes are notified of the change, while node level settings are node/chassis specific and require manual propagation to the rest of the cluster. Node level settings also do not persist across reboots of a chassis (or of the entire cluster). Therefore, changing the value of a node level setting across the entire cluster through API calls must be accompanied by manually changing that same parameter in the node.cfg/cluster.cfg file, so that the new value is picked up from the configuration file after a reboot.
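
The manual configuration file change mentioned above could be scripted along these lines. This is a sketch under the assumption that the file consists of flat "name = value" lines with "#" comments; the helper name is hypothetical, the key and values in the sample are placeholders, and the real node.cfg/cluster.cfg should be checked (and backed up) before applying anything like it.

```python
def set_config_value(cfg_text: str, key: str, value: str) -> str:
    """Return cfg_text with 'key = value' set: an existing line for the
    key is replaced, otherwise a new line is appended at the end."""
    lines = cfg_text.splitlines()
    new_line = f"{key} = {value}"
    replaced = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Leave comments and lines without an assignment untouched.
        if stripped.startswith("#") or "=" not in stripped:
            continue
        name = stripped.split("=", 1)[0].strip()
        if name == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder content, not a real Swarm configuration.
    sample = "cluster.name = example-cluster\nlog.level = 30\n"
    print(set_config_value(sample, "log.level", "10"), end="")
```

The function works on text rather than on a file path so that the result can be reviewed before it is written back to the configuration file.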

To make API calls easier to work with, we have created two scripts: swarmctl.py and getsetAPIParam.sh. The first one, swarmctl.py, requires the Python modules "restnavigator" and "requests".

Here's an example of how to set the same parameter, "policy.eCEncoding", with the help of "swarmctl.py":

./swarmctl.py policy.eCEncoding 4:2 -u admin:caringo

where "admin" is the Swarm admin username and "caringo" is the Swarm admin password.

And here is an example of how to list all the Swarm parameters that are available through the API:

./swarmctl.py -a 192.168.209.84

where "192.168.209.84" is a Swarm node IP address.


Here's an example of how to set the parameter "policy.eCEncoding" with the help of "getsetAPIParam.sh":

./getsetAPIParam.sh -a policy.eCEncoding -v 4:2

where the script uses the Swarm administrator username and password from the config files.

And here is an example of how to list all the Swarm parameters that are available through the API using "getsetAPIParam.sh":

./getsetAPIParam.sh


Please check the help menu for both "swarmctl.py" and "getsetAPIParam.sh" for more options. Both scripts are part of the same Caringo support bundle as "snmp-castor-tool.sh".


© DataCore Software Corporation. · https://www.datacore.com · All rights reserved.