...

Most customers start with their nodes fully populated, but if you can afford extra server capacity early, leaving disk slots empty is just fine. When more space is needed, new disks can simply be hot-plugged into a node without downtime. Of course, any failed or retired drives should also be swapped for empty disks over time, and none of these operations require a reboot.
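A quick way to sanity-check that a hot-plugged disk was picked up is to compare the kernel’s list of block devices before and after insertion. This is a minimal sketch for a generic Linux node; it reads /sys/block directly and doesn’t involve any Swarm-specific tooling:

    import os

    def block_devices():
        # /sys/block lists every block device the kernel currently sees
        return set(os.listdir("/sys/block"))

    before = block_devices()
    input("Hot-plug the new disk, then press Enter... ")
    added = block_devices() - before
    print("New devices:", sorted(added) or "none")

This only confirms that the hardware event reached the operating system; it says nothing about how the storage software uses the new device.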

Adding Nodes

Adding one or more servers at a time is the most common way to add capacity. Adding nodes to a cluster is relatively easy, so I won’t dwell on those steps here. Instead, let’s focus on what happens when a new empty node is added to a cluster.

...

In the background, Swarm re-balances the cluster by moving objects from full volumes to less full ones. A couple of settings control this behavior, but relocation is a necessary load on the cluster that can have some impact on client writes. In an ideal case, the cluster has free space on all volumes, so there are many places to quickly write objects during the inevitable recoveries that happen when disks go bad. The same is true for new writes: having options enables better load balancing and better performance. If cluster performance is important to your business, proactively adding space allows for a longer re-balancing window with less performance impact.
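To make the idea concrete, here is a toy sketch of that kind of re-balancing in Python. This is not Swarm’s actual algorithm or its settings; it only illustrates moving data from the fullest volume to the emptiest one until utilization falls within a target spread (the volume names, the 1% step, and the 10% spread are invented for illustration):

    def rebalance_plan(volumes, max_spread=0.10, step=0.01):
        """volumes maps volume name -> fraction used (0.0-1.0).
        Returns (src, dst) moves, each shifting ~1% of capacity."""
        vols = dict(volumes)
        moves = []
        while max(vols.values()) - min(vols.values()) > max_spread:
            src = max(vols, key=vols.get)  # fullest volume
            dst = min(vols, key=vols.get)  # emptiest volume
            vols[src] -= step
            vols[dst] += step
            moves.append((src, dst))
        return moves

    # Two nearly full volumes plus one freshly added, nearly empty disk
    plan = rebalance_plan({"vol1": 0.92, "vol2": 0.90, "vol3": 0.05})
    print(f"{len(plan)} object moves needed to even out the cluster")

Notice how many moves even a small cluster generates; that background traffic is exactly why adding space before volumes fill up gives the cluster a longer, gentler re-balancing window.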

A more advanced technique can be used when all volumes in the cluster are getting full and new nodes are added. The idea is to “sprinkle” empty volumes throughout all the nodes in the cluster so that, when the operation is complete, both “old” and “new” nodes hold a mixture of “old” and “new” disks: some mostly full, and some with plenty of available capacity. This option isn’t commonly used because it involves a lot of hands-on work on the cluster during the upgrade.
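For illustration, here is a hypothetical planning helper for that sprinkle approach. It is not a Swarm feature; it simply pairs old and new nodes and swaps a couple of disks between each pair so every node ends up with a mix of full and empty drives (the node and disk names are placeholders, and the physical swaps still happen by hand, one disk at a time):

    def sprinkle_plan(old_nodes, new_nodes, swaps_per_pair=2):
        """old_nodes/new_nodes map node name -> list of disk IDs.
        Returns (disk, from_node, to_node) moves mixing old and new disks."""
        plan = []
        for old, new in zip(sorted(old_nodes), sorted(new_nodes)):
            for i in range(swaps_per_pair):
                plan.append((old_nodes[old][i], old, new))  # full disk -> new node
                plan.append((new_nodes[new][i], new, old))  # empty disk -> old node
        return plan

    for move in sprinkle_plan({"node1": ["d1", "d2", "d3"]},
                              {"node4": ["e1", "e2", "e3"]}):
        print(move)

Because every entry in the plan is a physical pull-and-insert on a live node, even this small example makes the hands-on cost obvious: four disk moves per node pair.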

...

At the end of this process, there will still be re-balancing work, but it will be more evenly spread throughout the cluster, and minimal replica movement will be needed to protect the existing data.