...
Full reboot notifies every chassis in the cluster to reboot itself at the same time. The entire cluster is temporarily offline as the chassis reboot.
Full reboot
Code Block language bash platform restart storagecluster --full
Rolling reboot is a long-running process that keeps the cluster operational by rebooting one chassis at a time until the entire cluster is rebooted. A rolling reboot includes several options, such as limiting the reboot to one or more chassis:
Rolling reboot
Code Block language bash platform restart storagecluster --rolling [--chassis <comma-separated system IDs>] [--skipConnectionTest] [--skipUptimeTest] [--continueWithOfflineChassis] [--stopOnNodeError]
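As a minimal sketch of how the `--chassis` option takes its comma-separated list, the helper below joins system IDs and prints the resulting command for review. The function name `build_rolling_reboot` is hypothetical, and the second system ID is a placeholder; nothing here runs the `platform` CLI.

```shell
# Sketch only: build_rolling_reboot is a hypothetical helper, not part of the platform CLI.
# It prints the rolling-reboot command so it can be reviewed before being run by hand.
build_rolling_reboot() {
  local ids
  # Join all arguments with commas, since --chassis expects a comma-separated list.
  ids=$(IFS=,; printf '%s' "$*")
  if [ -n "$ids" ]; then
    printf 'platform restart storagecluster --rolling --chassis %s\n' "$ids"
  else
    # No system IDs given: reboot every chassis, one at a time.
    printf 'platform restart storagecluster --rolling\n'
  fi
}

# Example: 4y3h7p appears elsewhere in this document; 8k2m1q is a made-up placeholder.
build_rolling_reboot 4y3h7p 8k2m1q
```

Printing the command instead of executing it keeps the sketch safe to try on any machine; the optional flags shown above can be appended by hand.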
Info |
---|
Requirements: Before a rolling reboot can begin, these conditions must be met:
|
Managing Rolling Reboots
You have 10 seconds to cancel a rolling reboot before it begins. Once a rolling reboot has started, it stops and reports an error if any of the following occur:
A chassis is offline when it is selected for reboot. To have the reboot process ignore currently offline chassis, add the flag --continueWithOfflineChassis.
The chassis boots with a number of volumes that does not match the number present before the chassis was rebooted. A volume is considered up if it has a state of ok, retiring, retired, or unavailable.
The chassis does not come back online after 3 hours have passed.
By default, the reboot process continues if the volumes come up but a node goes into an error state. To have the reboot process stop in that case, add the flag --stopOnNodeError.
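The volume check described above can be illustrated with a short sketch. It only models the stated rule (a volume counts as up when its state is ok, retiring, retired, or unavailable); the function names are hypothetical, and this is not how Platform Server performs the check internally.

```shell
# Sketch only: hypothetical helpers that model the "volume is up" rule from the text.
volume_is_up() {
  case "$1" in
    ok|retiring|retired|unavailable) return 0 ;;  # states the text counts as up
    *) return 1 ;;                                # any other state is not up
  esac
}

# Count how many of the given volume states are considered up.
count_up_volumes() {
  local n=0 state
  for state in "$@"; do
    if volume_is_up "$state"; then
      n=$((n + 1))
    fi
  done
  printf '%s\n' "$n"
}
```

Comparing the count taken before a reboot with the count taken after it mirrors the mismatch condition that stops a rolling reboot.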
If a rolling reboot has stopped due to an error, resume the reboot using the resume command below after the error is resolved.
Status check — To retrieve the status of a rolling reboot task, use the following commands for reboots remaining and reboots completed:
...
in-progress: The rolling reboot is currently running.
paused: The rolling reboot is paused (using the pause command).
completed: The rolling reboot finished successfully.
cancelled: The rolling reboot was cancelled per a user request.
error: The rolling reboot is stopped due to an error.
...
pending: The rolling reboot task has not yet processed the chassis.
in-progress: The rolling reboot task is in the process of rebooting the chassis.
completed: The chassis was successfully rebooted.
removed: The chassis was removed from the list of chassis to process after the rolling reboot was started (using the delete rollingreboot command).
error: The chassis encountered an error of some kind.
abandoned: The chassis was being processed when a user cancelled the rolling reboot.
dropped: The rolling reboot was waiting for the chassis to reboot when a user requested a move to the next chassis (using the --skip flag).
offline: The chassis was already offline when the reboot task attempted to reboot it.
...
Exclude from reboot — To exclude one or more chassis that have not yet been rebooted from a currently running rolling reboot:
Code Block | ||
---|---|---|
| ||
platform delete rollingreboot --chassis <comma-separated system IDs> |
Pause reboot — To pause the current rolling reboot process so that it can be restarted later:
...
Create a node.cfg file and add any node-specific Swarm settings to apply, or leave it blank to accept all current settings.
Power on the chassis for the first time.
Wait until the chassis enlists and powers off.
Deploy the new server:
Code Block language bash platform deploy storage -n 1 -v <#.#.#-version-to-deploy>
Use the following process to deploy an individual chassis by system ID:
Create a node.cfg file and add any node-specific Swarm settings to apply, or leave it blank to accept all current settings.
Get a list of chassis that are available for deployment by using the following command:
Code Block language bash platform list nodes --state New
Choose a System ID to deploy a single chassis using a command like the following:
Code Block language bash platform deploy storage -y 4y3h7p -v 9.2.1
Service Proxy
If the Service Proxy is running on the Platform Server when adding or removing chassis, restart the service so that it picks up the new chassis list:
...
Code Block | ||
---|---|---|
| ||
platform restart proxy |
Reconfiguring the Cluster
Modify the cluster-wide Swarm configuration at any time using the CLI and a configuration file. The reconfiguration process is additive: all existing settings that are not referenced in the file are preserved. That is, if only two settings are defined, Platform overwrites or adds only those two settings.
Create a supplemental .cfg file (such as changes.cfg) and specify any new or changed Swarm settings to apply.
To upload the configuration changes, use the following CLI command:
Code Block language bash platform upload config -c {Path to .cfg}
The CLI parses the uploaded configuration file for changes to make to Platform.
If Swarm was running during the upload, Platform Server attempts to communicate the new configuration to Swarm. Any settings that cannot be communicated to Swarm require a reboot of the Swarm cluster to take effect. For each setting contained in the file, the CLI indicates if the setting was communicated to the Storage cluster and if a reboot is required. The Swarm UI also indicates which settings require rebooting.
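The additive behavior can be illustrated with a small sketch. It assumes a simplified file format of one `name = value` setting per line with `#` comments, which may not match the real .cfg syntax, and the function name `merge_cfg` is hypothetical. It shows the semantics only, not how Platform applies an upload.

```shell
# Sketch only: merge_cfg is a hypothetical helper that models additive reconfiguration.
# Assumption: each file holds "name = value" lines; lines starting with # are comments.
# $1 = file with the current settings, $2 = file with the changes to apply.
merge_cfg() {
  awk '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    /^[ \t]*#/ || !/=/ { next }          # skip comments and lines without a setting
    {
      eq  = index($0, "=")
      key = trim(substr($0, 1, eq - 1))
      val = trim(substr($0, eq + 1))
      if (!(key in seen)) { order[++n] = key; seen[key] = 1 }
      settings[key] = val                # a later file overwrites an earlier value
    }
    END { for (i = 1; i <= n; i++) printf "%s = %s\n", order[i], settings[order[i]] }
  ' "$1" "$2"
}

# Usage (paths are placeholders):
#   merge_cfg current.cfg changes.cfg
```

Settings that appear only in the first file pass through unchanged, settings in both files take the value from the changes file, and new settings are appended.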
...
Add the configuration change directly:
Code Block language text platform add config --name "chassis.processes" --value 6
Reconfiguring a Chassis
Modify the node-specific settings for a single chassis by the same process, but specify the MAC address of any valid NIC on that chassis.
Create a .cfg file (such as changes.cfg) and specify any new or changed node-specific settings to apply.
To upload the configuration changes, use the following CLI command:
Code Block language bash platform upload config -c {Path to .cfg} -m {mac address}
...
Releasing a Chassis
A chassis may need to be released from the Swarm cluster, either for temporary maintenance or for permanent removal.
Info |
---|
Important: To guarantee a clean shutdown, power off the chassis through the UI or SNMP before running a release command. |
Temporary release — Temporary release of a chassis assumes that the chassis is added back into the cluster at a later time. Releasing a chassis allows deallocating cluster resources, such as IP addresses, or wiping and resetting the configuration.
...
Permanent removal — Permanent removal is for retiring a chassis altogether or changing the chassis' main identifying information, such as changing a NIC. Removing the chassis from management causes the chassis to start the provisioning life cycle as if it were a brand new chassis if it is powered on again.
Once the chassis is powered off, remove the chassis from Platform Server management permanently:
Permanent removal
Code Block | ||
---|---|---|
| ||
platform release storagechassis -y <system-id> --remove |
Resetting to Defaults
To clear out all existing setting customizations from a given chassis or the entire cluster, issue the following commands.
Info |
---|
Note: These commands require a cluster reboot because the reset is not communicated to the Storage network dynamically. |
...
Code Block | ||
---|---|---|
| ||
platform delete allclusterconfig |
Managing Subclusters
After all the chassis are deployed and running, assign chassis to subclusters.
Use the list command to see the current subcluster assignments:
List subclusters
Code Block | ||
---|---|---|
| ||
platform subcluster list |
...
Info |
---|
Note: Reassignment is not immediate. Allow time for every node on the chassis to be migrated to the new subcluster. |
Use the unassign command to remove a chassis from a subcluster:
Remove from subcluster
Code Block | ||
---|---|---|
| ||
platform subcluster unassign -y <system-id> |
...
Changing the Default Gateway
By default, the Platform Server configures Swarm Storage to use the Platform Server as its default gateway.
To override this behavior, either add a "network.gateway" setting to the cluster configuration file or issue the following command:
Code Block | ||
---|---|---|
| ||
platform add config --name "network.gateway" --value "<ip-of-gateway>" |
...
With one exception, modifying the admin users for the Storage cluster requires the Storage cluster to be up and running before the operations can be done. The one exception is the "snmp" user, which can have its password set while the cluster is down or before the cluster is booted for the first time.
...
Info |
---|
Important: Modifying passwords for the admin user requires restarting the Service Proxy, if installed. It can also require updates to Gateway configuration. |
Use the following CLI command to add a new admin user:
Add admin user
Code Block | ||
---|---|---|
| ||
platform add adminuser [--askpassword] [--username <username>] [--password <user password>] [--update] |
...
Upload the new version of the Swarm Storage software to Platform server, verifying the <version-name> matches the version of Swarm Storage being uploaded:
Code Block language bash
platform upload storageimages -i <path-to-zip> -v <version-name>
platform upload storageimages -i ./storage-9.6.0-x86_64.zip -v 9.6
Note: The zip file above is contained within the Swarm-{version}-{date}.zip file. Inside this zip, a folder called Storage contains a file called storage-{version}-x86_64.zip.
Get a full listing of all of the nodes along with IPs, MAC addresses, and system IDs:
Code Block language bash platform list nodes --state Deployed
Using the list of system IDs, deploy the upgrade on each of the nodes. Run the restart command as well if restarting the node immediately after upgrade:
Code Block language bash
platform deploy storage --upgrade -v 9.2.1 -y <system-id>
platform restart storagenode -y <system-id>
If each node was not restarted individually, restart the cluster now, either full or rolling:
Code Block language bash
platform restart storagecluster --full
# or
platform restart storagecluster --rolling [<options>]
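When many nodes need the upgrade, the per-node commands above can be generated from a list of system IDs. The sketch below only prints the commands so they can be reviewed before being run; the function name `print_upgrade_commands` is hypothetical, and the system IDs must be copied by hand from the `platform list nodes --state Deployed` output, since its format is not assumed here.

```shell
# Sketch only: print_upgrade_commands is a hypothetical helper.
# $1 = version to deploy, remaining arguments = system IDs.
# It prints the upgrade and restart commands for each node without running them.
print_upgrade_commands() {
  local version=$1 id
  shift
  for id in "$@"; do
    printf 'platform deploy storage --upgrade -v %s -y %s\n' "$version" "$id"
    printf 'platform restart storagenode -y %s\n' "$id"
  done
}

# Example with the version and system ID used elsewhere in this document.
print_upgrade_commands 9.2.1 4y3h7p
```

If the nodes are to be restarted together with a full or rolling cluster restart instead, the `restart storagenode` lines in the output can simply be skipped.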
Managing Service Proxy
Status — Use this command to check the status of the Service Proxy:
Code Block | ||
---|---|---|
| ||
platform status proxy |
Upgrade — To upgrade the Service Proxy on the Platform server, use the CLI to upload the version and deploy it:
Code Block | ||
---|---|---|
| ||
platform deploy proxy -b <path-to-zip> --upgrade |
Info |
---|
Note: After a Service Proxy upgrade, it takes several minutes for the UI to come back up. |
Configuring DNS
The Storage nodes may need to resolve names for outside resources, such as Elasticsearch or Syslog. To allow this, configure the DNS server on the Platform Server to communicate with outside domains.
Option 1:
...
Forwarding
A Slave/Backup DNS zone is a read-only copy of the DNS records; it only receives updates from the Master zone of the DNS server.
If no DNS master/slave relationships are configured, perform simple forwarding by having the domain managed by the Platform server forward all lookups to outside domains:
Edit /etc/bind/named.conf.options and add the following line after the "listen-on-v6" line:
Code Block forwarders {172.30.0.202;};
Run the following command to restart bind9 on the Platform Server:
Code Block language bash sudo systemctl restart bind9
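The edit in the first step can also be previewed without touching the live file. The sketch below prints a copy of an options file with a forwarders line added after the line containing `listen-on-v6`; the function name `add_forwarders` is hypothetical, and the forwarder address is the example value from the step above, which should be replaced with the real upstream DNS server.

```shell
# Sketch only: add_forwarders is a hypothetical helper.
# $1 = path to a BIND options file, $2 = IP address of the DNS server to forward to.
# The edited content goes to stdout; the input file is left unchanged.
add_forwarders() {
  awk -v ip="$2" '
    { print }                                               # copy every line through
    /listen-on-v6/ { printf "\tforwarders {%s;};\n", ip }   # add the line right after it
  ' "$1"
}

# Usage (review the output before copying it into place):
#   add_forwarders /etc/bind/named.conf.options 172.30.0.202 > /tmp/named.conf.options.new
```

The helper is not idempotent: running it against a file that already contains a forwarders line adds a second one, so check the file first.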
Option 2: Configuring a Slave DNS Zone
If an external DNS Zone is configured, have the Platform Server become a slave DNS of that zone; the reverse can be done to allow other systems to resolve names for servers managed by the Platform server.
This process assumes that the external DNS server is configured to allow zone transfers to the Platform server. The DNS server on the Platform server is not configured to restrict zone transfers to other DNS slaves.
...
Configuring Docker Bridge
To configure or modify the network information used by the default Docker (docker0) bridge, edit the file /etc/docker/daemon.json. Add networking properties as properties to the root JSON object in the file:
...
The bip property sets the IP address and subnet mask to use for the default docker0 bridge. See the Docker documentation for details on the different properties.
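As a rough illustration of the `bip` property, the sketch below writes a minimal JSON object containing only that key to a scratch file. The function name `write_daemon_json` and the address in the usage line are placeholders, not values from this document, and the helper overwrites its target, so point it at a scratch path and merge the result into an existing daemon.json by hand.

```shell
# Sketch only: write_daemon_json is a hypothetical helper.
# $1 = output path (use a scratch file), $2 = bridge address in CIDR form.
# Writes a root JSON object whose only property is "bip".
write_daemon_json() {
  cat > "$1" <<EOF
{
  "bip": "$2"
}
EOF
}

# Usage (placeholder address; choose a range that does not clash with the Storage network):
#   write_daemon_json /tmp/daemon.json 192.168.1.1/24
```

Docker typically needs to be restarted before a change to daemon.json takes effect.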