US 12,216,527 B1
System and method for data ingestion, anomaly and root cause detection
Abraham Starosta, Miami, FL (US); Francis Beckert, Mountain View, CA (US); and Chandrima Sarkar, Dublin, CA (US)
Assigned to Splunk Inc., San Jose, CA (US)
Filed by Splunk, Inc., San Francisco, CA (US)
Filed on Jan. 24, 2022, as Appl. No. 17/583,056.
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 11/00 (2006.01); G06F 11/07 (2006.01); G06F 16/23 (2019.01); G06N 20/00 (2019.01); G06F 11/30 (2006.01); G06F 11/34 (2006.01)
CPC G06F 11/0751 (2013.01) [G06F 11/0781 (2013.01); G06F 16/2365 (2019.01); G06N 20/00 (2019.01); G06F 11/076 (2013.01); G06F 11/079 (2013.01); G06F 11/0793 (2013.01); G06F 11/3058 (2013.01); G06F 11/3447 (2013.01)]
30 Claims
1. A computerized method comprising:
detecting a data ingestion anomaly through deployment of a first machine learning model taking a data ingestion volume as input, wherein detection of the data ingestion anomaly indicates that the data ingestion volume is an anomalous data ingestion volume;
determining a cause for the data ingestion anomaly by at least:
(i) determining features of the anomalous data ingestion volume,
(ii) training a second machine learning model with historical data sets consistent with the determined features of the anomalous data ingestion volume,
(iii) deploying a second machine learning model taking a data ingestion sub-volume as input, wherein the second machine learning model is configured to predict whether the data ingestion sub-volume is anomalous, wherein the data ingestion sub-volume is a portion of the anomalous data ingestion volume,
(iv) obtaining system state information during ingestion of the anomalous data ingestion sub-volume, and
(v) determining the cause of the anomalous data ingestion volume based on the system state information; and
generating a graphical user interface (GUI) that illustrates a graphical representation of the data ingestion volume over a prescribed time period and a visual representation of an error threshold relative to a predicted data ingestion volume over the prescribed time period, wherein the data ingestion anomaly is displayed as a visually distinct visual element from the graphical representation of the data ingestion volume and is displayed outside of the visual representation of the error threshold, and wherein the GUI displays a remedial recommendation for resolving the cause of the anomalous data ingestion value including an indication to fix or replace a component that is inoperable.
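
The two-stage flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not name any particular model, so a mean-plus-k-standard-deviations band stands in here for both the "first" and "second" machine learning models, a per-source volume stands in for a "data ingestion sub-volume", and a plain dictionary stands in for "system state information". All function names, the parameter k, and the sample data are hypothetical.

```python
from statistics import mean, stdev


def fit_band(history, k=3.0):
    """Return (predicted volume, error threshold) from historical volumes.

    Assumption: the prediction is the historical mean and the error
    threshold is k sample standard deviations. The claim does not
    specify how either value is derived.
    """
    return mean(history), k * stdev(history)


def is_anomalous(volume, history, k=3.0):
    """True when the volume falls outside the error threshold band."""
    predicted, threshold = fit_band(history, k)
    return abs(volume - predicted) > threshold


def diagnose(current, history_by_source, system_state, k=3.0):
    """Sketch of the claimed flow: detect, drill down, look up a cause.

    current: {source: volume} for the interval being checked.
    history_by_source: {source: [past volumes]}, equal-length lists.
    system_state: {source: description of state during ingestion}.
    Returns None when the total volume is not anomalous; otherwise a
    dict mapping each anomalous source to its recorded system state.
    """
    # Stage 1: check the total ingestion volume against its own history.
    total_history = [sum(vals) for vals in zip(*history_by_source.values())]
    if not is_anomalous(sum(current.values()), total_history, k):
        return None

    # Stage 2: check each sub-volume against history for that source only,
    # then attach whatever system state was recorded for it.
    causes = {}
    for source, volume in current.items():
        if is_anomalous(volume, history_by_source[source], k):
            causes[source] = system_state.get(source, "unknown")
    return causes


if __name__ == "__main__":
    history = {"web": [100, 102, 98, 101, 99], "db": [50, 51, 49, 50, 50]}
    state = {"db": "forwarder offline"}
    print(diagnose({"web": 100, "db": 5}, history, state))
```

In this sketch the per-source history plays the role of the "historical data sets consistent with the determined features" in step (ii); a real system would presumably select training data by richer features than the source name alone. The GUI element of the claim (the plotted volume, the threshold band, and the remedial recommendation) is outside the scope of the example.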