DxCLI rescan-disk Timeout Caused By Cloud-Init
Summary
Linux systems using DM multipath may run into a known cloud-init problem that causes the dxcli rescan-disk
command to timeout on one or more nodes.
Command failed: <node_name>: Command failed on <X> of <Y> targets
Information
These command errors will also be accompanied by the following error in journalctl:
DxStorMonitor: I/O timeout Invalidating cached metadata from disk
The root of the issue is user-level processes waiting a long time on udev to respond during partition table rescan events.
Resolution
Removing /lib/udev/rules.d/10-cloud-init-hook-hotplug.rules can be used to workaround the problem.
For more detailed information about this issue, please see the following Cloud-Init bug report: