US 12,406,026 B2
Abnormal log event detection and prediction
Yi Ming Wang, Xian (CN); Hui Dong, Xian (CN); Zhong Fang Yuan, Xian (CN); Tong Liu, Xian (CN); Yan Fen Liu, Tianjin (CN); and Ling Chen, Beijing (CN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Jun. 30, 2021, as Appl. No. 17/363,147.
Prior Publication US 2023/0004750 A1, Jan. 5, 2023
Int. Cl. G06F 18/23213 (2023.01); G06F 16/18 (2019.01); G06F 16/9032 (2019.01); G06F 16/906 (2019.01); G06F 18/214 (2023.01)
CPC G06F 18/23213 (2023.01) [G06F 16/1805 (2019.01); G06F 16/90332 (2019.01); G06F 16/906 (2019.01); G06F 18/214 (2023.01)]
20 Claims
 
1. A computer-implemented method comprising:
receiving a current log line of a log file indicative of a current log event reflecting a portion of processing performed by one or more computer(s) in an information technology (IT) system;
categorizing the current log event as belonging to a first event cluster of a plurality of event clusters with the first event cluster being a central processing unit (CPU) normal type cluster;
determining a probability value that the first event cluster will be followed by event(s) from a second event cluster of the plurality of event clusters, with the second event cluster being a CPU abnormal type cluster;
responsive to determining the probability value that the first event cluster will be followed by the event(s) from the second event cluster, determining that the probability value exceeds a first threshold;
responsive to determining that the probability value exceeds the first threshold, predicting, by machine logic, a predicted time of event transition from a time of the current log event until event(s) of the second event cluster are likely to occur;
responsive to predicting the predicted time of the event transition, determining that the predicted time of the event transition from the time of the current log event until event(s) of the second event cluster are likely to occur is below a second threshold; and
responsive to the determination that the probability value that the first event cluster will be followed by the event(s) from the second event cluster exceeds the first threshold and further responsive to the determination that the predicted time of the event transition is below the second threshold, communicating a warning that abnormal CPU operations are likely to occur so that countermeasures can be taken so that the CPU remains operational, wherein predicting the predicted time of the event transition comprises:
obtaining, by one or more processors, a probability of the event transition from the current event cluster to each of the plurality of event clusters to form a plurality of obtained event transitions;
determining, by the one or more processors, an obtained event transition having a highest probability from the plurality of obtained event transitions; and
in response to the obtained event transition having the highest probability being directed to at least one abnormal event cluster, calculating, by the one or more processors, a mean time of the obtained event transition having the highest probability in an operational history of the IT system as the predicted time.
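
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes log events have already been categorized into clusters, that transition probabilities are estimated from simple counts of consecutive events in the operational history, and that one check on the highest-probability transition can stand in for the separate probability and time determinations of the claim. All names (`build_transition_stats`, `check_for_warning`, the cluster labels, and the threshold values) are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def build_transition_stats(history):
    """Estimate transition probabilities and observed transition times.

    history: time-ordered list of (timestamp_seconds, cluster_label) pairs.
    Assumption: probabilities are plain frequency estimates over
    consecutive events; the claim does not prescribe an estimator.
    """
    counts = defaultdict(lambda: defaultdict(int))
    gaps = defaultdict(list)
    for (t0, c0), (t1, c1) in zip(history, history[1:]):
        counts[c0][c1] += 1
        gaps[(c0, c1)].append(t1 - t0)
    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs, gaps


def check_for_warning(current_cluster, probs, gaps, abnormal_clusters,
                      prob_threshold, time_threshold):
    """Return warning details, or None when no warning is warranted."""
    transitions = probs.get(current_cluster, {})
    if not transitions:
        return None
    # Transition with the highest probability out of the current cluster.
    target, p = max(transitions.items(), key=lambda kv: kv[1])
    # It must lead to an abnormal cluster and exceed the first threshold.
    if target not in abnormal_clusters or p <= prob_threshold:
        return None
    # Predicted time: mean historical duration of that transition.
    predicted = mean(gaps[(current_cluster, target)])
    # Warn only if the predicted time is below the second threshold.
    if predicted >= time_threshold:
        return None
    return {"target": target, "probability": p, "predicted_seconds": predicted}


# Hypothetical operational history: (seconds, cluster label).
history = [
    (0, "cpu_normal"), (10, "cpu_abnormal"), (20, "cpu_normal"),
    (25, "cpu_abnormal"), (40, "cpu_normal"), (50, "cpu_normal"),
    (60, "cpu_abnormal"),
]
probs, gaps = build_transition_stats(history)
warning = check_for_warning("cpu_normal", probs, gaps, {"cpu_abnormal"},
                            prob_threshold=0.5, time_threshold=30)
if warning is not None:
    print("Warning: abnormal CPU operations likely:", warning)
```

In this hypothetical history, three of the four transitions out of `cpu_normal` lead to `cpu_abnormal`, so the estimated probability is 0.75, and the mean of the observed transition times (10, 5, and 10 seconds) serves as the predicted time. Both thresholds are satisfied, so the sketch prints a warning.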