US 12,353,969 B2
System and method for examining data from a source
Yunyu Kuang, Waterloo (CA); Chun Yau Hao, Oakville (CA); Hugo Adolfo Gonçalves Cibrão, Toronto (CA); and Nicholas Victor Laurence Morin, Toronto (CA)
Assigned to The Toronto-Dominion Bank, Toronto (CA)
Filed by The Toronto-Dominion Bank, Toronto (CA)
Filed on Nov. 1, 2023, as Appl. No. 18/499,556.
Application 18/499,556 is a continuation of application No. 16/455,100, filed on Jun. 27, 2019, granted, now 11,842,252.
Prior Publication US 2024/0062117 A1, Feb. 22, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 17/18 (2006.01); G06F 16/2458 (2019.01); G06F 16/248 (2019.01); G06N 20/00 (2019.01)
CPC G06N 20/00 (2019.01) [G06F 16/2462 (2019.01); G06F 16/248 (2019.01); G06F 17/18 (2013.01)] 20 Claims
 
1. A device for examining data from a source used in downstream processes, the device comprising:
a processor;
a data interface coupled to the processor; and
a memory coupled to the processor, the memory storing computer executable instructions that when executed by the processor cause the processor to:
obtain access to a plurality of statistical models, each of the plurality of statistical models having been generated based on a set of historical data for a first period of time and a forecast for each model by:
training a first model with data from a second period of time of the set of historical data, the second period being a subset of the first period;
forecasting data for a third period of time, subsequent to the second period, and a subset of the first period, and comparing the forecasted data for the third period with the corresponding set of historical data; and
repeating the training and forecasting for each of the plurality of statistical models;
select one of the plurality of statistical models by evaluating the models based on how good each model would have been at forecasting current data during the period of time associated with the forecast, by comparing data generated by each model to a set of current data for a fourth period of time, to determine which of the models generates a forecast whose values are closest to the set of current data being analyzed, the fourth period of time subsequent to the first period of time;
generate a new forecast using the selected model, the new forecast predicting occurrences during the fourth period of time;
compare the set of current data against the predicted occurrences in the new forecast to identify any data points in the set of current data with unexpected values; and
responsive to detecting one or more data points with unexpected values, interrupt or stop the downstream process that uses the current data, to enable an investigation to be conducted.
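
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the three toy forecasters, the mean-absolute-error measure of "closest", the fixed `tolerance` used to decide what counts as "unexpected", the `DownstreamHalted` exception, and the sample values are all assumptions made for illustration only.

```python
from statistics import mean
from typing import Callable, Dict, List, Sequence

# Hypothetical model interface: (training values, horizon) -> forecast values.
Forecaster = Callable[[Sequence[float], int], List[float]]


def naive_last(train: Sequence[float], horizon: int) -> List[float]:
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def mean_model(train: Sequence[float], horizon: int) -> List[float]:
    """Repeat the mean of the training values."""
    return [mean(train)] * horizon


def drift(train: Sequence[float], horizon: int) -> List[float]:
    """Extend the straight line through the first and last training values."""
    slope = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + slope * (h + 1) for h in range(horizon)]


MODELS: Dict[str, Forecaster] = {
    "naive_last": naive_last,
    "mean": mean_model,
    "drift": drift,
}


def mae(forecast: Sequence[float], actual: Sequence[float]) -> float:
    """Mean absolute error (an assumed measure of closeness)."""
    return mean(abs(f - a) for f, a in zip(forecast, actual))


def backtest(models, history, holdout):
    """Train on the earlier part of the history (second period), forecast the
    held-out tail (third period), and compare with the historical values."""
    train, test = history[:-holdout], history[-holdout:]
    return {name: mae(fit(train, holdout), test) for name, fit in models.items()}


def select_model(models, history, current):
    """Pick the model whose forecast for the current (fourth) period is
    closest to the current data."""
    errors = {name: mae(fit(history, len(current)), current)
              for name, fit in models.items()}
    return min(errors, key=errors.get)


def find_unexpected(forecast, current, tolerance):
    """Indices of current data points farther than `tolerance` from the forecast."""
    return [i for i, (f, c) in enumerate(zip(forecast, current))
            if abs(c - f) > tolerance]


class DownstreamHalted(Exception):
    """Raised to stop the downstream process so the data can be investigated."""


def examine(models, history, current, tolerance):
    """Select a model, forecast the current period, and halt on unexpected values."""
    name = select_model(models, history, current)
    forecast = models[name](history, len(current))
    flagged = find_unexpected(forecast, current, tolerance)
    if flagged:
        raise DownstreamHalted(f"unexpected values at positions {flagged}")
    return name


# Made-up sample values, for illustration only.
HISTORY = [10, 12, 14, 16, 18, 20, 22, 24]   # first period
CURRENT = [26, 28, 90, 32]                   # fourth period, one outlier
```

With these sample values, `examine(MODELS, HISTORY, CURRENT, tolerance=5.0)` raises `DownstreamHalted` because the third current value lies far from the selected model's forecast. A real system would presumably use fitted statistical models and a statistically derived threshold in place of the fixed tolerance.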