US 12,298,840 B1
Skip learning for multivariate anomaly detection on streaming data
Rajeev Rai Bhatia, Lynnwood, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Nov. 21, 2022, as Appl. No. 17/991,726.
Int. Cl. G06F 11/00 (2006.01); G06F 11/07 (2006.01); G06N 20/20 (2019.01)
CPC G06F 11/0751 (2013.01) [G06F 11/0721 (2013.01); G06N 20/20 (2019.01)]
20 Claims
 
1. A computer-implemented method for detecting anomalies in time series data, comprising:
obtaining time series data spanning a moving time period to be used as training data, the time series data comprising values from at least one sensor or data source generated periodically for the moving time period, wherein the moving time period comprises at least a first time window associated with a first portion of time series data and a second time window associated with a second portion of time series data, wherein the first time window and the second time window individually comprise a length of time;
upon obtaining a third portion of time series data spanning the length of time and associated with a third time window, adding the third portion of time series data associated with the third time window to the training data and removing the first portion of time series data associated with the first time window corresponding to a beginning of the moving time period;
training an isolation forest machine learning model using the training data to identify anomalies in the time series data, wherein training the isolation forest machine learning model comprises:
detecting an anomaly in the time series data corresponding to a first time period;
pausing training the isolation forest machine learning model by not entering additional values corresponding to a number of periods after the first time period into the isolation forest machine learning model; and
resuming training the isolation forest machine learning model by entering additional values corresponding to at least one period after the number of periods has elapsed; and
obtaining a fourth portion of time series data spanning the length of time and associated with a fourth time window; and
using the trained isolation forest machine learning model to detect anomalies in the fourth portion of time series data.
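
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation and rests on several assumptions that do not appear in the claim: the class names `SkipLearningDetector` and `ZScoreModel`, the parameters `max_windows` and `skip_periods`, and the choice to withhold the anomalous value itself from training are all hypothetical. To stay dependency-free, the sketch substitutes a simple z-score detector for the isolation forest; any model exposing `fit` and `is_anomaly` (for example, a wrapper around an isolation forest library) could presumably be dropped in instead.

```python
from collections import deque
from statistics import mean, pstdev


class ZScoreModel:
    """Stand-in detector (NOT an isolation forest): flags values far from the training mean."""

    def __init__(self, threshold=3.0):
        self.threshold = threshold
        self.mu, self.sigma = 0.0, 1.0

    def fit(self, values):
        self.mu = mean(values)
        self.sigma = pstdev(values) or 1.0  # fall back to 1.0 on constant data

    def is_anomaly(self, value):
        return abs(value - self.mu) / self.sigma > self.threshold


class SkipLearningDetector:
    """Moving-window training that pauses for a few periods after each anomaly."""

    def __init__(self, model, max_windows=2, skip_periods=2):
        self.model = model
        # deque(maxlen=...) drops the oldest window when a new one is appended,
        # which mirrors removing the first window from the moving time period.
        self.windows = deque(maxlen=max_windows)
        self.skip_periods = skip_periods
        self._remaining_skip = 0
        self._trained = False

    def add_window(self, values):
        """Fold a new window into the training data; return indices flagged as anomalous."""
        flagged, kept = [], []
        for i, v in enumerate(values):
            if self._trained and self.model.is_anomaly(v):
                flagged.append(i)
                self._remaining_skip = self.skip_periods  # pause training
            elif self._remaining_skip > 0:
                self._remaining_skip -= 1  # still paused: value is not entered
            else:
                kept.append(v)  # training resumed: value is entered
        self.windows.append(kept)
        training = [v for window in self.windows for v in window]
        if training:
            self.model.fit(training)
            self._trained = True
        return flagged

    def detect(self, values):
        """Score a window with the trained model without changing the training data."""
        return [i for i, v in enumerate(values) if self.model.is_anomaly(v)]


detector = SkipLearningDetector(ZScoreModel(), max_windows=2, skip_periods=2)
detector.add_window([10, 11, 9, 10, 11, 9])            # first window: nothing to score against yet
print(detector.add_window([10, 50, 12, 11, 10, 9]))    # [1]: 50 is flagged; 12 and 11 are skipped
print(detector.add_window([10, 11, 9, 10]))            # []: third window evicts the first
print(detector.detect([10, 40, 9]))                    # [1]: fourth window is scored only
```

In this sketch the two values following the anomaly (12 and 11) are never entered into the training data, so a transient disturbance does not shift the model's notion of normal behavior. Whether the pause should also carry across window boundaries, as it does here, is a design choice the claim leaves open.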