...

  1. Cross-reference the slot number determined above with the drive enclosure specification of the storage node. This lets you identify the drive bay number of the failed drive in the enclosure / chassis.
    For the example below, we use the disk slot designations on a Huawei 5288 V5 server enclosure:
    Slot of front disks

    [Image: front disk slot layout]

    Slot of rear disks

    [Image: rear disk slot layout]
  2. With the failed drive's physical location identified, you can now replace it. First, suspend volume recovery in the storage cluster, either from Swarm UIS or with the ‘swarmctl’ CLI tool (we use swarmctl below):

    /root/dist/swarmctl -d [node_ip] -C recovery.suspend -V true -p admin:[password]

  3. Next, remove the failed disk from the chassis according to the slot number identified in the previous section. You can compare the serial number on the drive with the one displayed in Swarm UIS to confirm that the correct drive has been removed. If the serial numbers don’t match, simply insert the disk back into the enclosure.

  4. With the failed drive removed, insert your replacement drive into the empty slot.

  5. You can now re-enable volume recovery in the cluster by turning off volume recovery suspension:

    /root/dist/swarmctl -d [node_ip] -C recovery.suspend -V false -p admin:[password]
  6. Verify that the new drive appears in Swarm UIS. After a few minutes, it should show a non-zero stream count, which means it is actively taking data.
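
The suspend and resume calls in steps 2 and 5 differ only in the value passed to -V, so they can be wrapped in one small helper. The following is a minimal sketch, not part of Swarm: the function name set_recovery_suspend and the SWARMCTL and DRY_RUN variables are illustrative assumptions, and only the swarmctl options already shown above are used. With DRY_RUN=1 the helper prints the command (password masked) instead of running it, which is a convenient way to check the node IP before touching the cluster.

```shell
#!/bin/sh
# Sketch only: wraps the two swarmctl calls from steps 2 and 5.
# SWARMCTL, DRY_RUN and set_recovery_suspend are hypothetical names
# introduced for this example; they are not provided by Swarm.

SWARMCTL="${SWARMCTL:-/root/dist/swarmctl}"

# Usage: set_recovery_suspend NODE_IP PASSWORD true|false
#   true  = suspend volume recovery (step 2)
#   false = re-enable volume recovery (step 5)
set_recovery_suspend() {
    node_ip="$1"
    password="$2"
    value="$3"

    # Reject anything other than the two values used in the procedure.
    case "$value" in
        true|false) ;;
        *) echo "value must be true or false" >&2; return 2 ;;
    esac

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Print the command instead of running it; the password is masked.
        echo "$SWARMCTL -d $node_ip -C recovery.suspend -V $value -p admin:****"
    else
        "$SWARMCTL" -d "$node_ip" -C recovery.suspend -V "$value" -p "admin:$password"
    fi
}
```

For example, after sourcing the file, run set_recovery_suspend [node_ip] [password] true before pulling the drive, and set_recovery_suspend [node_ip] [password] false once the replacement is seated.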

...