CPC G06F 17/18 (2013.01) [G06N 5/04 (2013.01); G06N 7/01 (2023.01)] | 22 Claims |
1. A computer-implemented method for determining whether a data element, having a value, of a time-series dataset is an outlier, the method comprising:
obtaining prediction data, for predicting a value of the data element, from first data of the time-series dataset that temporally precedes the data element;
predicting, using the prediction data, a predicted value of the data element;
obtaining an error value for the data element representative of a difference between the value and the predicted value of the data element;
obtaining historic error values for the time-series dataset, each historic error value being representative of a difference between a value and a predicted value of a second data element of the time-series dataset that temporally precedes the data element;
obtaining, based on one or more of the historic error values, a threshold value for the error value of the data element defining error values for the data element that are considered to be outliers, wherein obtaining the threshold value comprises:
determining a predetermined number based on a percentage of error values expected to be outliers;
multiplying a statistical measure of the historic error values by the predetermined number to produce a result, wherein the statistical measure includes one of a mean, median, mode, and standard deviation; and
determining the threshold value based on the result, wherein a different threshold value is determined for each data element of the time-series dataset; and
determining whether the data element is an outlier based on a comparison of the threshold value with the error value of the data element.
|
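By way of non-limiting illustration only, the steps recited in claim 1 might be sketched in Python as follows. The claim does not specify a predictor, a statistical measure, or how the predetermined number is derived from the expected outlier percentage, so every such choice here is an assumption: the moving-average predictor, the use of the standard deviation, the normal-quantile multiplier, and the names `predict_next`, `multiplier_for`, `detect_outliers`, `window`, `outlier_fraction`, and `min_history` are all hypothetical.

```python
from statistics import NormalDist, mean, stdev


def predict_next(preceding, window=3):
    """Assumed predictor: moving average of the last `window` preceding values."""
    return mean(preceding[-window:])


def multiplier_for(outlier_fraction):
    """Assumed mapping from the expected outlier fraction to the
    'predetermined number': the two-sided standard-normal quantile."""
    return NormalDist().inv_cdf(1 - outlier_fraction / 2)


def detect_outliers(series, window=3, outlier_fraction=0.05, min_history=5):
    """Return one boolean per data element; True marks an outlier."""
    k = multiplier_for(outlier_fraction)
    historic_errors = []  # signed errors of temporally preceding elements
    flags = []
    for i, value in enumerate(series):
        if i < window:
            # Not enough preceding data to form a prediction.
            flags.append(False)
            continue
        predicted = predict_next(series[:i], window)
        error = value - predicted
        if len(historic_errors) >= min_history:
            # The threshold is recomputed for each data element from the
            # historic errors available at that point in the series.
            threshold = k * stdev(historic_errors)
            flags.append(abs(error) > threshold)
        else:
            # Too few historic errors to estimate a threshold yet.
            flags.append(False)
        historic_errors.append(error)
    return flags


series = [10, 11, 10, 11, 10, 11, 10, 11, 10, 11, 50, 10]
flags = detect_outliers(series)
```

In this sketch the flagged element's own error is added to the historic errors, which widens the threshold for the elements that follow; whether flagged errors should be excluded from the history is a design choice the claim leaves open.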