US 12,298,839 B2
Method for controlling distributed operation system, device, and medium
Shuaijian Wang, Beijing (CN); Shiyong Li, Beijing (CN); Henghua Zhang, Beijing (CN); Panpan Li, Beijing (CN); Zaibin Hu, Beijing (CN); and Baotong Luo, Beijing (CN)
Assigned to BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD., Beijing (CN)
Appl. No. 18/041,035
Filed by BEIJING BAIDU NETCOM SCIENCE TECHNOLOGY CO., LTD., Beijing (CN)
PCT Filed Jun. 7, 2022, PCT No. PCT/CN2022/097438
§ 371(c)(1), (2) Date Feb. 8, 2023,
PCT Pub. No. WO2023/115836, PCT Pub. Date Jun. 29, 2023.
Claims priority of application No. 202111602689.1 (CN), filed on Dec. 24, 2021.
Prior Publication US 2024/0303142 A1, Sep. 12, 2024
Int. Cl. G06F 11/07 (2006.01); G06F 9/455 (2018.01); G06F 11/14 (2006.01)
CPC G06F 11/0712 (2013.01) [G06F 9/45558 (2013.01); G06F 11/079 (2013.01); G06F 11/1438 (2013.01); G06F 2009/45591 (2013.01); G06F 2201/815 (2013.01)]
15 Claims
 
1. A method for controlling a distributed operation system, comprising:
for a first container carrying a first process, determining a current fault type of a failure in the first container in response to detecting that the first process is triggered to terminate based on the failure in the first container;
reconstructing the first container and restarting the first process based on the first container reconstructed in response to determining that the current fault type is consistent with a target fault type;
wherein the target fault type is a fault type suitable for reconstruction of each container in the distributed operation system to which the first container belongs, and the fault type suitable for reconstruction of each container in the distributed operation system to which the first container belongs is a system failure that occurs in a failed container itself.
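
The control flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `FaultType` enum, the `Container` record, the `classify_fault` helper, and the termination-reason strings are all hypothetical names introduced here for illustration, and the set of reasons treated as failures of the container itself is an assumption. The sketch only shows the decision the claim describes: when the first process terminates because of a failure in its container, the fault type is determined, and the container is reconstructed and the process restarted only if that type matches the target fault type.

```python
from dataclasses import dataclass
from enum import Enum, auto


class FaultType(Enum):
    # A system failure that occurs in the failed container itself.
    CONTAINER_SYSTEM_FAILURE = auto()
    # Hypothetical catch-all for every other kind of failure.
    OTHER = auto()


# Fault types for which reconstruction is considered suitable.
TARGET_FAULT_TYPES = {FaultType.CONTAINER_SYSTEM_FAILURE}

# Assumption: these reason strings are invented for illustration only.
CONTAINER_LOCAL_REASONS = {"runtime_crash", "filesystem_corrupt"}


@dataclass
class Container:
    name: str
    generation: int = 0          # incremented on each reconstruction
    process_running: bool = True


def classify_fault(reason: str) -> FaultType:
    """Map a termination reason to a fault type (hypothetical mapping)."""
    if reason in CONTAINER_LOCAL_REASONS:
        return FaultType.CONTAINER_SYSTEM_FAILURE
    return FaultType.OTHER


def handle_termination(container: Container, reason: str) -> Container:
    """React to the first process terminating due to a container failure."""
    container.process_running = False
    fault = classify_fault(reason)
    if fault in TARGET_FAULT_TYPES:
        # Reconstruct the container and restart the process in it.
        return Container(
            name=container.name,
            generation=container.generation + 1,
            process_running=True,
        )
    # Fault type is not suitable for reconstruction: leave the container as is.
    return container
```

In this sketch, `handle_termination(Container("worker-0"), "runtime_crash")` returns a rebuilt container with the process running again, whereas a reason outside the assumed set leaves the original container stopped so that some other recovery path can handle it.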