Currently, OnApp implements vDisk rebalancing as delete/resync - it deletes the member being moved away from, sets membership to the new target member, then initiates a resync to complete the task.
For a default datastore with 2 replicas, this means the vDisk is degraded during the process, with only 1 copy of the stripe being moved left in the system. Losing that single copy makes the vDisk completely unrecoverable - which could happen if the physical drive the resync is reading from dies partway through the process.
There should at least be an option to make rebalance copy/delete instead - initialize an extra node for that stripe, sync that node while maintaining the original two, then on completion remove the old node from membership and delete the VM disk data there.
That would keep the VM disk from ever actually being degraded, at the cost of some performance - so it should be made an option (possibly the default, or with the default configurable per cloud / datastore).
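
To illustrate the difference, here is a minimal sketch in Python that models only the ordering of the two approaches and tracks the lowest number of in-sync copies seen along the way. It is not OnApp code: the `Stripe` class, the function names and the node names are all hypothetical, and the real resync is of course not instantaneous.

```python
from dataclasses import dataclass, field


@dataclass
class Stripe:
    # Hypothetical model: nodes currently holding an in-sync copy of the stripe.
    in_sync: set
    # Lowest number of in-sync copies observed so far.
    min_copies: int = field(init=False)

    def __post_init__(self):
        self.min_copies = len(self.in_sync)

    def _record(self):
        self.min_copies = min(self.min_copies, len(self.in_sync))

    def drop(self, node):
        # Remove a member and delete its copy of the data.
        self.in_sync.discard(node)
        self._record()

    def sync(self, node):
        # Add a member and bring it fully in sync.
        self.in_sync.add(node)
        self._record()


def rebalance_delete_resync(stripe, source, target):
    # Current behaviour as described above: delete first, then resync.
    stripe.drop(source)
    stripe.sync(target)


def rebalance_copy_delete(stripe, source, target):
    # Requested behaviour: sync an extra copy first, then delete the old one.
    stripe.sync(target)
    stripe.drop(source)


current = Stripe({"node-a", "node-b"})
rebalance_delete_resync(current, "node-a", "node-c")

proposed = Stripe({"node-a", "node-b"})
rebalance_copy_delete(proposed, "node-a", "node-c")

print(current.min_copies, proposed.min_copies)  # prints: 1 2
```

Both orderings end with the same membership, but only delete/resync passes through a state with a single copy; copy/delete temporarily holds three copies instead, which is where the extra cost comes from.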