| CPC G06F 11/1484 (2013.01) [G06F 9/45558 (2013.01); G06F 11/079 (2013.01); G06F 11/1438 (2013.01); G06F 2009/45575 (2013.01); G06F 2009/45579 (2013.01); G06F 2009/45591 (2013.01); G06F 2009/45595 (2013.01)] | 18 Claims |

|
1. A cluster system, comprising:
a first storage node and a second storage node arranged in an active-active storage system configuration,
wherein the first storage node is configured as a first virtual machine (VM) executing on at least one hypervisor host, and
wherein the second storage node is configured as a second VM executing on the at least one hypervisor host; and
a plurality of storage drives communicably coupled to the first VM and the second VM,
wherein the first VM is configured to:
engage in communications with the second VM through a plurality of communication mechanisms including the plurality of storage drives and multiple network or channel connections;
determine that the second VM is malfunctioning based at least on a response or a lack of response to the communications;
enforce one or more actions or processes to initiate self-fencing of the second VM determined to be malfunctioning;
obtain, from the at least one hypervisor host, an identifier (ID) of the second VM determined to be malfunctioning; and
force a kernel panic at the second VM having the obtained ID by issuing a non-maskable interrupt (NMI) on the at least one hypervisor host.
|
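
The sequence recited in claim 1 (probe the peer over several mechanisms, decide it is malfunctioning, initiate self-fencing, obtain the peer's VM ID from the hypervisor host, then issue an NMI) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: `FakeHypervisorHost`, `lookup_vm_id`, `inject_nmi`, `peer_is_malfunctioning`, and `fence_peer` are illustrative stand-ins, and the rule "malfunctioning only when no mechanism returns the expected reply" is an assumed policy, since the claim only requires that the determination rest at least on a response or a lack of response. On a libvirt-managed host, `virsh inject-nmi` is one real command that delivers an NMI to a guest, but the claim does not name any particular tool.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A probe returns the peer's reply, or None when there is no response.
Probe = Callable[[], Optional[str]]


@dataclass
class FakeHypervisorHost:
    """Hypothetical stand-in for the hypervisor host's management interface."""
    vm_ids: Dict[str, int]
    nmi_log: List[int] = field(default_factory=list)

    def lookup_vm_id(self, vm_name: str) -> int:
        # Assumed: the host can map a VM name to its identifier.
        return self.vm_ids[vm_name]

    def inject_nmi(self, vm_id: int) -> None:
        # Assumed: a real host would deliver an NMI to the guest here;
        # this fake only records which VM ID was targeted.
        self.nmi_log.append(vm_id)


def peer_is_malfunctioning(probes: Dict[str, Probe], expected: str = "alive") -> bool:
    """Assumed policy: the peer is malfunctioning only if every mechanism
    (storage drives, network or channel connections) fails to return the
    expected reply, whether by a wrong response or by no response."""
    return all(probe() != expected for probe in probes.values())


def fence_peer(
    host: FakeHypervisorHost,
    peer_name: str,
    probes: Dict[str, Probe],
    request_self_fence: Callable[[], None],
) -> bool:
    """Run the claimed steps in order; return True if the peer was fenced."""
    if not peer_is_malfunctioning(probes):
        return False
    request_self_fence()                   # initiate self-fencing of the peer
    vm_id = host.lookup_vm_id(peer_name)   # obtain the peer's ID from the host
    host.inject_nmi(vm_id)                 # force a kernel panic via NMI
    return True
```

In this sketch the first VM would call `fence_peer` with one probe per communication mechanism, for example one probe that reads a heartbeat from the shared storage drives and one per network connection. Whether the guest actually panics on an NMI depends on the guest operating system's configuration, which the sketch does not model.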