Best practices in configuring S Series Balancing

The S Series Balancing service balances object data across S Series Nodes in the same storage pool to spread the storage load.

When the S Series Balancing service runs, balancing is initiated if S Series Nodes in the same storage pool have a percent-used disparity greater than 10%. Balancing is relative to the size of each storage node; that is, it compares the percentage of used capacity on each node rather than the absolute amount of data.
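As a rough sketch, the trigger condition can be expressed in a few lines. The function name and node values here are illustrative only, not part of any S Series API:

```python
def balancing_needed(node_percents, threshold=10.0):
    """Balancing is initiated when the percent-used disparity between
    the fullest and emptiest node in the pool exceeds the threshold."""
    return max(node_percents) - min(node_percents) > threshold

# Two nodes at 36% and 18% used: an 18% disparity, so balancing runs
print(balancing_needed([36.0, 18.0]))  # True
# At 31% and 23% used the disparity is 8%, so it does not
print(balancing_needed([31.0, 23.0]))  # False
```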

Best practices

To optimize balancing in a deployment with multiple attached S Series Nodes, storage pools should either share all of their S Series Nodes with one another or share none.

The practice of sharing all S Series Nodes allows data from any pool to be stored on any node, giving the balancing algorithm the greatest freedom to move data and making use of all available storage capacity. Each pool must also have balancing enabled; if one pool does not enable balancing, balancing for the other pools might not complete.

Similarly, the practice of pools sharing none of their S Series Nodes ensures that each pool that participates in balancing can always move data.

The next figure shows a best-practice configuration where Pools A and B share all the same S Series Nodes, in the example, Nodes 1 and 2.

Because there is a greater than 10% disparity in the used capacity between Nodes 1 and 2 (Node 1 = 36% used, Node 2 = 18% used), the S Series Balancing service will have work to do. Each time the service runs, data will be moved from Node 1 to Node 2 until there is a less than 10% disparity in used capacity. For example, if Node 1 reaches 31% used capacity and Node 2 reaches 23% used capacity, the service will stop.
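The convergence described above can be modeled with a short loop. This is only an illustration of the behavior: the per-run step size (2.5 percentage points here) and the assumption of equal-size nodes are not from the product documentation.

```python
def balance(pcts, threshold=10.0, step=2.5):
    """Repeatedly move `step` percentage points of used capacity from
    the fullest node to the emptiest until the disparity drops below
    the threshold. Equal-size nodes are assumed, so a percent removed
    from one node equals a percent added to the other."""
    pcts = list(pcts)
    while max(pcts) - min(pcts) > threshold:
        pcts[pcts.index(max(pcts))] -= step
        pcts[pcts.index(min(pcts))] += step
    return pcts

print(balance([36.0, 18.0]))  # [31.0, 23.0], matching the example above
```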

The next figure shows a best-practice configuration where Pools A and B each have their own set of S Series Nodes. There is no node sharing between the pools.

For Pool A, because there is a greater than 10% disparity in the used capacity between Nodes 1 and 2 (Node 1 = 36% used, Node 2 = 18% used), the S Series Balancing service will have work to do when it runs. For Pool B, because there is a less than 10% disparity in the used capacity between Nodes 3 and 4 (Node 3 = 36% used, Node 4 = 32% used), the S Series Balancing service will not have any work to perform.

Poor practices

Some configuration scenarios can prevent the S Series Balancing service from moving data, or can lead to less-than-optimal performance. When pools share some, but not all, of the same S Series Nodes, an object imbalance results across nodes. Because data from any pool cannot be placed on any node, the balancing algorithm cannot use all available capacity. In some instances, data on one S Series Node in a pool should be moved to another node in the pool, but there is insufficient movable data to balance the nodes.

The next figure shows a configuration where Pools A and B share Node 1 but not Node 2.

Because there is a greater than 10% disparity in the used capacity between Nodes 1 and 2 (Node 1 = 41% used, Node 2 = 9% used), the S Series Balancing service has work to do for Pool B. When the service runs, data will be moved from Node 1 to Node 2. Data movement will occur each time the S Series Balancing service is scheduled to run, or when the service is initiated manually from the Overview page of the System Management Console.

However, only the data available to Pool B (in Bucket 2) will be moved. As mentioned earlier, the S Series Balancing service completes when there is a less than 10% disparity in the used capacity between the nodes that are participating in balancing. In this scenario, even after the 5 TB in Bucket 2 is moved from Node 1 to Node 2, there will still be a greater than 10% balance disparity. The S Series Balancing service will continue to report that it has work to do, but it will be unable to address the disparity.
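To make the arithmetic concrete, assume two equal 100 TB nodes. The node capacity is an assumption for illustration; the source gives only the percentages and the 5 TB figure for Bucket 2.

```python
# Hypothetical 100 TB nodes; Node 1 is 41% used, Node 2 is 9% used
CAPACITY_TB = 100.0
node1_used, node2_used = 41.0, 9.0
movable_tb = 5.0  # only Bucket 2's data is available to Pool B

# Move everything Pool B is allowed to move from Node 1 to Node 2
node1_used -= movable_tb
node2_used += movable_tb

disparity = 100.0 * (node1_used - node2_used) / CAPACITY_TB
print(f"disparity after move: {disparity:.0f}%")  # 22% -- still above 10%
```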

To correct this poorly performing configuration, you could create another bucket on Node 2 and add the bucket to Pool A.

© 2015, 2020 Hitachi Vantara LLC. All rights reserved.