It’s common for the data footprint of a cluster to grow over time and eventually exceed its original cluster sizing. There are a variety of things to consider to make this a smooth process.

...

Cluster administrators should monitor space usage over time and not delay in adding capacity. Remember that with COVID and other economic disruptions, there are often delays in getting hardware even after you are ready to cut a purchase order. If your cluster is growing and over 80% full, it’s already time to start planning to add capacity.
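As a rough illustration of the monitoring advice above, the following Python sketch flags a growing cluster that is over 80% full and estimates how many days remain at the current growth rate. It is not part of Swarm: the `UsageSample` type, the function names, and the linear-growth assumption are all hypothetical, and the usage figures would have to come from whatever monitoring you already have in place.

```python
from dataclasses import dataclass
from typing import List, Optional

# Threshold taken from the guidance above: start planning at 80% full.
PLANNING_THRESHOLD = 0.80


@dataclass
class UsageSample:
    """One observation of cluster usage (hypothetical type, not a Swarm API)."""
    day: int         # days since the first observation
    used_tb: float   # space used across the cluster, in TB


def fraction_full(used_tb: float, capacity_tb: float) -> float:
    """Return used space as a fraction of total capacity."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return used_tb / capacity_tb


def daily_growth_tb(samples: List[UsageSample]) -> float:
    """Average growth per day between the first and last sample (assumes linear growth)."""
    first, last = samples[0], samples[-1]
    span = last.day - first.day
    if span <= 0:
        raise ValueError("samples must span at least one day")
    return (last.used_tb - first.used_tb) / span


def days_until_full(samples: List[UsageSample], capacity_tb: float) -> Optional[float]:
    """Estimate days until the cluster is full, or None if usage is not growing."""
    growth = daily_growth_tb(samples)
    if growth <= 0:
        return None
    remaining = max(capacity_tb - samples[-1].used_tb, 0.0)
    return remaining / growth


def should_plan_capacity(samples: List[UsageSample], capacity_tb: float) -> bool:
    """True when the cluster is growing and already over the planning threshold."""
    growing = daily_growth_tb(samples) > 0
    return growing and fraction_full(samples[-1].used_tb, capacity_tb) > PLANNING_THRESHOLD
```

Comparing the `days_until_full` estimate against your expected hardware lead time gives a simple way to decide how urgent the purchase order is.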

...

In the background, Swarm re-balances the cluster by moving objects from full volumes to less full ones. A couple of settings control this behavior, but relocation is a necessary load on the cluster that can have some impact on client writes. Ideally, the cluster has free space on all volumes so that there are many places to quickly write objects during the inevitable recoveries that happen when disks go bad. The same is true for new writes: having lots of disks with space enables better load balancing and better performance. If cluster performance is important to your business, proactively adding space allows for a longer re-balancing window with less performance impact.

A more involved technique can be used if all volumes in the cluster are getting full by the time new nodes are added. The idea is to “sprinkle” empty volumes throughout all the nodes in the cluster so that when the operation is complete, “old” and “new” nodes have a mixture of “old” and “new” disks, meaning both fuller disks and disks with plenty of available capacity. This option isn’t commonly used because it involves a lot of hands-on work on the cluster during the upgrade.
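The “sprinkle” idea can be sketched as a small planning helper that decides how many of the new, empty volumes each node should end up holding. This is only an illustration under simplifying assumptions: the function name, the node names, and the notion of a fixed number of disk slots per node are hypothetical, and the sketch does not model the physical disk swaps or any Swarm procedure for retiring and moving volumes.

```python
from typing import Dict


def plan_sprinkle(slots_per_node: Dict[str, int], new_volume_count: int) -> Dict[str, int]:
    """Spread new (empty) volumes across all nodes as evenly as possible.

    slots_per_node maps a node name to the number of disk slots it has.
    Returns a mapping of node name to the number of empty volumes it should
    hold when the operation is complete; the node's other slots hold the
    existing, fuller volumes.
    """
    if new_volume_count > sum(slots_per_node.values()):
        raise ValueError("more new volumes than available slots")

    plan = {node: 0 for node in slots_per_node}
    remaining = new_volume_count
    # Deal volumes out one at a time, round-robin, skipping nodes that are full.
    while remaining > 0:
        for node, slots in slots_per_node.items():
            if remaining == 0:
                break
            if plan[node] < slots:
                plan[node] += 1
                remaining -= 1
    return plan


if __name__ == "__main__":
    # Two existing nodes plus one new node, four slots each, four new disks.
    print(plan_sprinkle({"old1": 4, "old2": 4, "new1": 4}, 4))
```

Dealing the volumes out round-robin keeps the count of empty disks per node within one of each other, which is the mixture of fuller and emptier disks the technique aims for.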

...