CPC G06F 9/45558 (2013.01) [G06N 7/01 (2023.01); G06F 2009/4557 (2013.01); G06F 2009/45583 (2013.01)] | 11 Claims |
1. A method for handling a memory failure, comprising:
in response to detecting a failure occurring in memory of a host machine, acquiring a failure parameter of the memory;
determining a crash probability of the host machine based on the failure parameter; and
transferring all virtual machines on the host machine to a target host machine when the crash probability is greater than or equal to a first predetermined threshold, wherein a crash probability of the target host machine is less than a second predetermined threshold, and the second predetermined threshold is less than the first predetermined threshold;
wherein the method further comprises:
acquiring a first control instruction sent by a kernel system;
writing information into a target position of a target memory page of the host machine based on the first control instruction;
generating a first code corresponding to the target position of the target memory page based on the written information;
acquiring a second control instruction sent by a kernel system;
reading information out from the target position of the target memory page of the host machine based on the second control instruction;
generating a second code corresponding to the target position of the target memory page based on the read-out information; and
determining that the failure occurs in the target memory page when the first code is different from the second code;
wherein the acquiring the failure parameter of the memory comprises:
parsing the first code and the second code based on a predetermined algorithm;
acquiring difference codes between the first code and the second code after the parsing;
determining one or more incorrect bits corresponding to the target position of the target memory page based on the difference codes; and
determining a number of the one or more incorrect bits and position features of the one or more incorrect bits based on the one or more incorrect bits;
the method further comprising:
marking the memory when the crash probability of the host machine is less than the first predetermined threshold and greater than or equal to the second predetermined threshold; and
determining target virtual machines based on the crash probability of the host machine and a number of all the virtual machines on the host machine and transferring the target virtual machines, wherein a number of the target virtual machines is less than the number of all the virtual machines.
|
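The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions. The first and second codes are modeled as plain integers. The "predetermined algorithm" is assumed to be a bitwise XOR, which the claim does not specify. The rule for choosing how many virtual machines to move in the middle tier is a hypothetical proportional rule. All function and class names are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureParameter:
    """Number and positions of incorrect bits (names are illustrative)."""
    incorrect_bit_count: int
    incorrect_bit_positions: List[int]


def failure_detected(first_code: int, second_code: int) -> bool:
    # A failure is determined when the code generated at write time
    # differs from the code generated at read time.
    return first_code != second_code


def acquire_failure_parameter(first_code: int, second_code: int) -> FailureParameter:
    # Assumption: XOR stands in for the "predetermined algorithm";
    # each set bit of the difference marks one incorrect bit.
    diff = first_code ^ second_code
    positions = [i for i in range(diff.bit_length()) if (diff >> i) & 1]
    return FailureParameter(len(positions), positions)


def choose_action(crash_probability: float,
                  first_threshold: float,
                  second_threshold: float) -> str:
    # The claim requires the second threshold to be below the first.
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be less than first threshold")
    if crash_probability >= first_threshold:
        return "migrate_all"
    if crash_probability >= second_threshold:
        return "mark_and_migrate_some"
    return "no_action"


def eligible_target(target_crash_probability: float,
                    second_threshold: float) -> bool:
    # A target host must have a crash probability below the second threshold.
    return target_crash_probability < second_threshold


def partial_migration_count(crash_probability: float, vm_count: int) -> int:
    # Hypothetical rule: move a share of VMs proportional to the crash
    # probability, always strictly fewer than all VMs on the host.
    if vm_count <= 0:
        return 0
    return max(0, min(vm_count - 1, int(crash_probability * vm_count)))
```

For example, under these assumptions a host with a crash probability of 0.5, a first threshold of 0.8, and a second threshold of 0.3 falls into the middle tier: its memory is marked and only a subset of its virtual machines is transferred. How the crash probability itself is derived from the failure parameter is left out of the sketch, since the claim does not describe that mapping.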