US 12,229,114 B2
Data anomaly detection
Vinay Dhingra, Gurugram (IN); Agraj Gupta, New Delhi (IN); Ashank Gupta, New Delhi (IN); Vaibhav Gupta, New Delhi (IN); Anam Hyderi, Udaipur (IN); Sandeep Pattanayak, Jodhpur (IN); Purvi Shah, New Brunswick, NJ (US); and Shikha, New Delhi (IN)
Assigned to AMERICAN EXPRESS TRAVEL RELATED SERVICES COMPANY, INC., New York, NY (US)
Filed by American Express Travel Related Services Company, Inc., New York, NY (US); and American Express (India) Private Limited, Gurgaon (IN)
Filed on Sep. 27, 2022, as Appl. No. 17/953,390.
Prior Publication US 2024/0104083 A1, Mar. 28, 2024
Int. Cl. G06F 16/00 (2019.01); G06F 16/23 (2019.01)
CPC G06F 16/2365 (2019.01)
20 Claims
 
1. A system, comprising:
a computing device comprising a processor and a memory;
a profiling service stored in the memory that, when executed by the processor, causes the computing device to at least:
generate a variable profile for each variable in source data, each variable representing a column in the source data and the variable profile comprising one or more of a mean, median, mode, minimum, or maximum value for the column in the source data; and
provide the variable profile for each variable to each of a plurality of machine learning models;
wherein the plurality of machine learning models are stored in the memory and each of the plurality of machine learning models, when executed by the processor, causes the computing device to at least:
determine whether each variable profile is anomalous; and
provide a determination whether each variable profile is anomalous to an ensemble model; and
wherein the ensemble model is stored in the memory and, when executed by the processor, causes the computing device to at least:
analyze each variable profile using one or more ensemble machine-learning techniques; and
generate a final determination whether each variable profile is anomalous based at least in part on the determination received from each of the plurality of machine learning models; and
report the final determination for each variable profile to an analysis service, wherein the analysis service is further stored in the memory of the computing device and, when executed by the processor, further causes the computing device to at least:
receive, from a client device, an indication of the accuracy of the final determination; and
update the ensemble model based at least in part on the indication of the accuracy of the final determination.
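
The data flow recited in claim 1 (profiling, several independent detectors, an ensemble decision, and feedback-driven updating) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the function and class names, the comparison against a baseline profile, the fixed thresholds, and the use of weighted majority voting as the ensemble technique. The three rule-based detectors are hypothetical stand-ins for the plurality of machine learning models.

```python
import statistics
from typing import Callable, Dict, List, Sequence

Profile = Dict[str, float]
# A detector takes (current profile, baseline profile) and returns True if anomalous.
Detector = Callable[[Profile, Profile], bool]


def generate_profiles(source: Dict[str, Sequence[float]]) -> Dict[str, Profile]:
    """Build one profile per column: mean, median, mode, minimum, maximum."""
    return {
        name: {
            "mean": statistics.mean(col),
            "median": statistics.median(col),
            "mode": statistics.mode(col),
            "min": min(col),
            "max": max(col),
        }
        for name, col in source.items()
    }


# Hypothetical stand-ins for the individual models; thresholds are arbitrary.
def mean_shift(cur: Profile, base: Profile) -> bool:
    return abs(cur["mean"] - base["mean"]) > 0.5 * abs(base["mean"])


def max_jump(cur: Profile, base: Profile) -> bool:
    return cur["max"] > 2 * base["max"]


def median_shift(cur: Profile, base: Profile) -> bool:
    return abs(cur["median"] - base["median"]) > 0.5 * abs(base["median"])


class WeightedVoteEnsemble:
    """Assumed ensemble technique: weighted majority vote over detector outputs."""

    def __init__(self, n_models: int) -> None:
        self.weights: List[float] = [1.0] * n_models

    def decide(self, votes: Sequence[bool]) -> bool:
        # Anomalous if the weight behind "anomalous" votes exceeds half the total.
        score = sum(w for w, v in zip(self.weights, votes) if v)
        return score > sum(self.weights) / 2

    def update(self, votes: Sequence[bool], final: bool,
               was_correct: bool, rate: float = 0.5) -> None:
        # The client's accuracy indication tells us what the right answer was.
        truth = final if was_correct else not final
        # Reward detectors that agreed with the truth, penalize the rest.
        self.weights = [
            w * (1 + rate) if v == truth else w * (1 - rate)
            for w, v in zip(self.weights, votes)
        ]


if __name__ == "__main__":
    baseline = generate_profiles({"amount": [10, 12, 11, 10, 13]})
    current = generate_profiles({"amount": [10, 12, 11, 10, 500]})
    detectors: List[Detector] = [mean_shift, max_jump, median_shift]
    ensemble = WeightedVoteEnsemble(len(detectors))

    for name, profile in current.items():
        votes = [d(profile, baseline[name]) for d in detectors]
        final = ensemble.decide(votes)
        print(name, votes, final)
        # Simulated client feedback confirming the final determination.
        ensemble.update(votes, final, was_correct=True)
    print(ensemble.weights)
```

In this sketch the feedback step only rescales per-detector weights; a real ensemble model could instead be retrained on the labeled determinations, which the claim language also appears to permit.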