Version: v23.0

DxCLI rescan-disk Timeout Caused By Cloud-Init

Summary

Linux systems using DM multipath may run into a known cloud-init problem that causes the dxcli rescan-disk command to timeout on one or more nodes.

Command failed: <node_name>: Command failed on <X> of <Y> targets

These command errors will also be accompanied by the following error in journalctl:

DxStorMonitor: I/O timeout Invalidating cached metadata from disk

The root of the issue is user-level processes waiting a long time on udev to respond during partition table rescan events.

Removing /lib/udev/rules.d/10-cloud-init-hook-hotplug.rules can be used to workaround the problem.

For more detailed information about this issue, please see the following Cloud-Init bug report: