US 12,147,878 B2
Feedback-based training for anomaly detection
Barath Balasubramanian, Bothell, WA (US); Rahul Bhotika, Bellevue, WA (US); Niels Brouwers, New York, NY (US); Ranju Das, Seattle, WA (US); Prakash Krishnan, Oakland, NJ (US); Shaun Ryan James McDowell, Great Neck, NY (US); Anushri Mainthia, Seattle, WA (US); Rakesh Madhavan Nambiar, Seattle, WA (US); Anant Patel, Seattle, WA (US); Avinash Aghoram Ravichandran, Shoreline, WA (US); Joaquin Zepeda Salvatierra, Mercer Island, WA (US); and Gurumurthy Swaminathan, Redmond, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Nov. 27, 2020, as Appl. No. 17/106,026.
Prior Publication US 2022/0172100 A1, Jun. 2, 2022
Int. Cl. G06N 3/091 (2023.01); G06F 16/23 (2019.01); G06F 18/21 (2023.01); G06F 18/214 (2023.01); G06N 3/088 (2023.01); G06N 5/04 (2023.01); G06N 20/00 (2019.01); G06T 7/00 (2017.01)

CPC G06N 20/00 (2019.01) [G06F 16/2379 (2019.01); G06F 18/214 (2023.01); G06F 18/2178 (2023.01); G06N 3/088 (2013.01); G06N 3/091 (2023.01); G06N 5/04 (2013.01); G06T 7/0004 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/30108 (2013.01)]

20 Claims

1. A computer-implemented method comprising:

receiving a request to perform feedback-based retraining, the request including one or more of an identifier of one or more models to retrain, an identifier of a dataset to use for retraining, an identifier of a dataset to use for testing, an indication of a threshold for an anomaly, an indication of how to display items to verify, and an indication of where to store historical information;

training a plurality of anomaly detection machine learning models using a training dataset that is at least partially annotated to generate a trained plurality of anomaly detection machine learning models;

selecting an anomaly detection machine learning model of the trained plurality of anomaly detection machine learning models based at least in part on a test metric;

applying the anomaly detection machine learning model on an unlabeled dataset to generate, per dataset item of the unlabeled dataset, a prediction and an importance ranking score for the prediction, wherein the importance ranking score is based on a probability that the dataset item was classified correctly;

selecting, based on the importance ranking scores, a result of the application of the scoring machine learning model on the unlabeled dataset;

providing the result and requesting feedback on the result;

receiving the feedback;

adding data from the unlabeled dataset into the training dataset when the feedback indicates a verified result;

retraining the anomaly detection machine learning model using the training dataset with the data added from the unlabeled dataset to generate a retrained anomaly detection machine learning model; and

deploying the retrained anomaly detection machine learning model to perform inferences on unlabeled images.
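
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the one-dimensional features, the `ThresholdModel` class, the `train_and_select` and `feedback_round` helpers, and the `ask_reviewer` callback are all hypothetical names introduced for illustration. The sketch assumes that accuracy is the test metric, that anomalies have larger feature values than normal items, and that the importance ranking score is one minus the model's estimated probability of being correct, so the least certain predictions are sent for review first. The claim itself does not fix any of these choices.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Labeled = Tuple[float, bool]  # (feature value, is_anomaly); hypothetical toy data


@dataclass
class ThresholdModel:
    """Toy detector: flags an item as anomalous when it exceeds a learned cutoff."""
    k: float            # position of the cutoff between the two class means (0..1)
    cutoff: float = 0.0
    scale: float = 1.0

    def fit(self, data: Sequence[Labeled]) -> "ThresholdModel":
        normal = [x for x, y in data if not y]
        anomalous = [x for x, y in data if y]
        mean_n = sum(normal) / len(normal)
        mean_a = sum(anomalous) / len(anomalous)
        self.cutoff = mean_n + self.k * (mean_a - mean_n)
        self.scale = abs(mean_a - mean_n) or 1.0
        return self

    def predict(self, x: float) -> Tuple[bool, float]:
        """Return (is_anomaly, estimated probability the prediction is correct)."""
        margin = abs(x - self.cutoff) / self.scale
        confidence = 0.5 + 0.5 * min(1.0, margin)  # assumed confidence proxy
        return x > self.cutoff, confidence


def accuracy(model: ThresholdModel, test: Sequence[Labeled]) -> float:
    return sum(model.predict(x)[0] == y for x, y in test) / len(test)


def train_and_select(train: Sequence[Labeled], test: Sequence[Labeled],
                     ks: Sequence[float]) -> ThresholdModel:
    """Train several candidate models and keep the one with the best test metric."""
    candidates = [ThresholdModel(k).fit(train) for k in ks]
    return max(candidates, key=lambda m: accuracy(m, test))


def feedback_round(model: ThresholdModel, train: Sequence[Labeled],
                   unlabeled: Sequence[float],
                   ask_reviewer: Callable[[float, bool], Optional[bool]],
                   n_review: int):
    """Rank predictions, request feedback on the top ones, then retrain."""
    scored = [(x, *model.predict(x)) for x in unlabeled]
    # Assumption: importance ranking score = 1 - probability of being correct.
    ranked = sorted(scored, key=lambda s: 1.0 - s[2], reverse=True)
    new_train: List[Labeled] = list(train)
    reviewed: List[float] = []
    for x, prediction, _conf in ranked[:n_review]:
        label = ask_reviewer(x, prediction)  # None means "not verified"
        if label is not None:
            new_train.append((x, label))
            reviewed.append(x)
    retrained = ThresholdModel(model.k).fit(new_train)
    return retrained, new_train, reviewed
```

In this sketch, `ask_reviewer` stands in for the step of providing a result and requesting feedback; it returns a verified label, or `None` when the reviewer does not verify the item, in which case the item stays out of the training dataset. The retrained model returned by `feedback_round` is what would then be deployed for inference.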