CPC G06F 11/0712 (2013.01) [G06F 9/45558 (2013.01); G06F 11/079 (2013.01); G06F 11/1438 (2013.01); G06F 2009/45591 (2013.01); G06F 2201/815 (2013.01)] | 15 Claims |
1. A method for controlling a distributed operation system, comprising:
for a first container carrying a first process, determining a current fault type of a failure in the first container in response to detecting that the first process is triggered to terminate based on the failure in the first container;
in response to determining that the current fault type is consistent with a target fault type, reconstructing the first container and restarting the first process based on the reconstructed first container;
wherein the target fault type is a fault type suitable for reconstruction of each container in the distributed operation system to which the first container belongs, and the fault type suitable for reconstruction of each container in the distributed operation system to which the first container belongs is a system failure that occurs in a failed container itself.
|
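The control flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `FaultType`, `Container`, `classify_fault`, `reconstruct`, and `handle_process_termination` are hypothetical, and reducing the failure to a simple "origin" string is an assumption made only for illustration. The sketch shows the three recited steps: determine the current fault type when the first process terminates, compare it with the target fault type, and reconstruct the container and restart the process only on a match.

```python
from dataclasses import dataclass
from enum import Enum, auto


class FaultType(Enum):
    # Hypothetical categories; the claim only distinguishes failures that
    # occur in the failed container itself from all other failures.
    CONTAINER_SYSTEM_FAILURE = auto()
    EXTERNAL_FAILURE = auto()


# Assumed target fault type: the one suitable for container reconstruction.
TARGET_FAULT_TYPE = FaultType.CONTAINER_SYSTEM_FAILURE


@dataclass
class Container:
    name: str
    generation: int = 0          # incremented on each reconstruction
    process_running: bool = True


def classify_fault(origin: str) -> FaultType:
    """Determine the current fault type from a hypothetical failure origin."""
    if origin == "container":
        return FaultType.CONTAINER_SYSTEM_FAILURE
    return FaultType.EXTERNAL_FAILURE


def reconstruct(container: Container) -> Container:
    """Build a fresh container with the same name; the process is not yet started."""
    return Container(
        name=container.name,
        generation=container.generation + 1,
        process_running=False,
    )


def handle_process_termination(container: Container, fault: FaultType) -> Container:
    """React to the first process being terminated by a failure in its container."""
    container.process_running = False
    if fault is not TARGET_FAULT_TYPE:
        # Not suitable for reconstruction: leave the container as it is.
        return container
    rebuilt = reconstruct(container)
    rebuilt.process_running = True  # restart the process on the rebuilt container
    return rebuilt
```

Under these assumptions, calling `handle_process_termination(Container("c1"), classify_fault("container"))` returns a rebuilt container with the process restarted, whereas a fault originating anywhere else leaves the container untouched and the process stopped.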