There is a recent regression, which will be resolved in an upcoming release, that causes NBD devices to be opened with the suboptimal cfq I/O scheduler. In some circumstances cfq has been known to cause NBD timeouts and, in turn, degraded disks.
We can check the scheduler in use for the existing devices on each HV by reading /sys/block/<device>/queue/scheduler; the active scheduler is shown in brackets:
This is correct:
noop anticipatory [deadline] cfq
This is wrong:
noop anticipatory deadline [cfq]
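A quick way to read the active scheduler for every NBD device at once might look like the following (a sketch assuming the standard Linux sysfs layout; on a host with no NBD devices attached, the glob matches nothing and the loop simply prints nothing):

```shell
# Print the scheduler line for each NBD device; the bracketed entry is active.
for f in /sys/block/nbd*/queue/scheduler; do
    [ -e "$f" ] || continue   # skip when no NBD devices are attached
    printf '%s: %s\n' "$f" "$(cat "$f")"
done
```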
We can set all NBD devices (and, if desired, the local sd devices) on a HV to use the deadline scheduler with the following, which can be done on the fly:
for d in /sys/block/nbd*/queue/scheduler; do echo deadline > "$d"; done
for d in /sys/block/sd[a-z]/queue/scheduler; do echo deadline > "$d"; done
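After applying the change, re-reading the same sysfs files confirms that no device is left on cfq. A minimal verification sketch (assuming the same sysfs layout as above; it reports nothing when no NBD devices are attached):

```shell
# Fail loudly if any NBD device is still using the cfq scheduler
# ("[cfq]" in the scheduler file marks it as active).
bad=0
for f in /sys/block/nbd*/queue/scheduler; do
    [ -e "$f" ] || continue           # no NBD devices attached: nothing to check
    if grep -q '\[cfq\]' "$f"; then
        echo "still on cfq: $f"
        bad=1
    fi
done
[ "$bad" -eq 0 ] && echo "no NBD device is using cfq"
```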
Note: We suggest applying this change as a precaution even if you are not currently seeing any of the issues described above.
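Because the loops above only change the running devices, the setting is lost when a device is detached and re-attached. Until the fixed release lands, one way to make it persistent is a udev rule along these lines (a sketch; the file name is illustrative and assumes a udev-based distribution):

```
# /etc/udev/rules.d/60-nbd-scheduler.rules (illustrative name)
ACTION=="add|change", KERNEL=="nbd[0-9]*", ATTR{queue/scheduler}="deadline"
```

Reload the rules (e.g. with udevadm control --reload) for the change to take effect on subsequently attached devices.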