US 11,892,900 B2
Root cause analysis of non-deterministic performance anomalies
Girish Nadger, Palo Alto, CA (US); Somenath Pal, Bangalore (IN); and Somaresh Sahu, Bangalore (IN)
Assigned to VMware LLC, Palo Alto, CA (US)
Filed by VMware LLC, Palo Alto, CA (US)
Filed on Oct. 11, 2019, as Appl. No. 16/599,134.
Claims priority of application No. 201941029688 (IN), filed on Jul. 23, 2019.
Prior Publication US 2021/0026723 A1, Jan. 28, 2021
Int. Cl. G06F 11/07 (2006.01)
CPC G06F 11/079 (2013.01) [G06F 11/0709 (2013.01); G06F 11/0751 (2013.01); G06F 11/0778 (2013.01)] 18 Claims
 
1. A method for performing root cause analysis to identify performance degradation in a network comprising a plurality of components, the method comprising:
at a set of one or more servers:
collecting, through the network, data regarding operation of a set of components, the collected data comprising deterministic data that provides a definitive value regarding operational states of a first subset of components, and non-deterministic data that provides non-definite values regarding operational state of a second subset of components;
performing a first analysis on the collected data to filter out data that is not relevant for the root cause analysis and to store data that remains after the filtering operation in a storage;
retrieving data from the storage to perform a second analysis on the collected, filtered data to identify an instance in time when one or more components, while still operational, are potentially suffering from performance degradation;
generating, from the collected, filtered data, a digital signature representing an operational performance of the set of components at the identified instance in time, said generated signature comprising deterministic and non-deterministic symptom values each of which is associated with a component in the set of components, each deterministic value specifying whether its associated component is operational or has failed, and each non-deterministic symptom value specifying whether its associated component is operating normally or anomalously; and
performing a third analysis that compares the generated signature with a plurality of pre-tabulated signatures each of which is associated with at least one particular root cause for performance degradation of one component, in order to identify a root cause of a performance degradation of at least one component in the network,
wherein at least two of the collecting, performing the first analysis, the retrieving, generating the digital signature, and performing the third analysis are performed in parallel.
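
The signature generation and comparison steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: the component names, the `KnownSignature` type, the z-score threshold used to label a non-deterministic reading as anomalous, and the fraction-of-matching-symptoms score used to compare signatures. The claim itself only requires that the generated signature be compared with pre-tabulated signatures, and the sketch does not model the filtering, storage, or parallel execution limitations.

```python
from dataclasses import dataclass

# Hypothetical symptom encodings; the claim only requires two states per kind.
OPERATIONAL, FAILED = "operational", "failed"  # deterministic values
NORMAL, ANOMALOUS = "normal", "anomalous"      # non-deterministic values


@dataclass(frozen=True)
class KnownSignature:
    """One pre-tabulated signature and the root cause it is associated with."""
    root_cause: str
    symptoms: dict  # component name -> expected symptom value


def build_signature(deterministic, non_deterministic, threshold=3.0):
    """Build a signature for one instant in time from filtered data.

    deterministic:     component -> bool (True if the component is up)
    non_deterministic: component -> (observed, baseline_mean, baseline_std)
    threshold:         assumed z-score cutoff for flagging an anomaly
    """
    signature = {}
    for component, is_up in deterministic.items():
        signature[component] = OPERATIONAL if is_up else FAILED
    for component, (value, mean, std) in non_deterministic.items():
        deviates = std > 0 and abs(value - mean) / std > threshold
        signature[component] = ANOMALOUS if deviates else NORMAL
    return signature


def match_score(signature, known):
    """Fraction of the known signature's symptoms found in the observed one."""
    if not known.symptoms:
        return 0.0
    hits = sum(1 for c, v in known.symptoms.items() if signature.get(c) == v)
    return hits / len(known.symptoms)


def identify_root_cause(signature, table):
    """Return the best-matching root cause and its score (assumed metric)."""
    best = max(table, key=lambda known: match_score(signature, known))
    return best.root_cause, match_score(signature, best)


# Illustrative data only; none of these values come from the patent.
deterministic = {"switch-1": True, "host-a": True}
non_deterministic = {
    "link-1-latency": (48.0, 10.0, 5.0),  # far above its baseline
    "host-a-cpu": (0.42, 0.40, 0.05),     # within its baseline
}
table = [
    KnownSignature("congested uplink",
                   {"switch-1": OPERATIONAL, "link-1-latency": ANOMALOUS}),
    KnownSignature("host CPU contention",
                   {"host-a": OPERATIONAL, "host-a-cpu": ANOMALOUS}),
]

signature = build_signature(deterministic, non_deterministic)
root_cause, score = identify_root_cause(signature, table)
print(root_cause, score)
```

In this sketch every component stays operational, so the deterministic values alone reveal nothing; the anomalous latency reading is what separates the two candidate root causes. A partial-match score is used instead of exact equality so that a signature with one unexpected symptom can still be ranked, which is a design choice of the sketch and not something the claim specifies.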