US 12,340,286 B2
Model management system for improving training data through machine learning deployment
Daniel Bibireata, Bellevue, WA (US); Andrew Yan-Tak Ng, Vancouver, WA (US); Pingyang He, Palo Alto, CA (US); Zeqi Qiu, Mountain View, CA (US); Camilo Iral, Guarne (CO); Mingrui Zhang, Beijing (CN); Aldrin Leal, Envigado (CO); Junjie Guan, Redmond, WA (US); Ramesh Sampath, Fremont, CA (US); Dillon Laird, San Francisco, CA (US); Yu Qing Zhou, San Francisco, CA (US); Juan Camilo Fernancez, Medellin (CO); Camilo Zapata, Medellin (CO); Sebastian Rodriguez, Medellin (CO); Cristobal Silva, Medellin (CO); Sanjay Bodhu, Aurora, IL (US); Mark William Sabini, River Edge, NJ (US); Leela Seshu Reddy Cheedepudi, Milpitas, CA (US); Kai Yang, Fremont, CA (US); Yan Liu, Palo Alto, CA (US); Whit Blodgett, San Francisco, CA (US); Ankur Rawat, Bothell, WA (US); Francisco Matias Cuenca-Acuna, Cordoba (AR); and Quinn Killough, Sonoma, CA (US)
Assigned to LandingAI Inc., Palo Alto, CA (US)
Filed by Landing AI Inc., Palo Alto, CA (US)
Filed on Sep. 9, 2021, as Appl. No. 17/470,368.
Claims priority of provisional application 63/195,698, filed on Jun. 1, 2021.
Claims priority of provisional application 63/163,368, filed on Mar. 19, 2021.
Prior Publication US 2022/0300855 A1, Sep. 22, 2022
Int. Cl. G06N 3/08 (2023.01); G06N 20/00 (2019.01)

CPC G06N 20/00 (2019.01) [G06N 3/08 (2013.01)]

20 Claims

1. A method for refining a machine learning model comprising:

receiving a set of outputs from a deployment of the machine learning model, wherein the set of outputs is generated by the deployment using a set of trained parameters associated with the machine learning model, the machine learning model trained with a training dataset comprising a plurality of training data points, the set of outputs comprising predictions from the machine learning model based on new inputs to the machine learning model;

determining, based on the set of outputs, that a particular training data point of the plurality of training data points is a noisy data point for which the trained model performs inadequately, the noisy data point corresponding to a particular training example within the training data set that was used to train the machine learning model;

responsive to identification of the noisy data point, determining a cause of failure based on a mapping of the noisy data point into a multi-dimensional space, which represents a distribution generated based on one or more attributes associated with the training dataset;

generating a refined training dataset by conducting a refinement towards the training dataset;

retraining the machine learning model with the refined training dataset, the retraining generating a set of updated trained parameters; and

sending the set of updated trained parameters to a user.
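
The steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; every name and heuristic below is a hypothetical stand-in chosen for illustration. It assumes a toy nearest-centroid classifier whose "trained parameters" are class centroids, deployment outputs that carry a feedback label alongside each prediction, the raw feature space as the "multi-dimensional space", and a k-nearest-neighbour label-disagreement check as the way a noisy training point and its cause of failure are found.

```python
import math
from collections import Counter

def train(dataset):
    """Fit a nearest-centroid model; its 'trained parameters' are the class centroids."""
    sums, counts = {}, Counter()
    for x, y in dataset:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: tuple(v / counts[y] for v in acc) for y, acc in sums.items()}

def predict(params, x):
    """Return the label whose centroid is closest to x."""
    return min(params, key=lambda y: math.dist(params[y], x))

def neighbour_majority(dataset, idx, k):
    """Majority label among the k training points closest to dataset[idx]."""
    x = dataset[idx][0]
    others = sorted((i for i in range(len(dataset)) if i != idx),
                    key=lambda i: math.dist(dataset[i][0], x))[:k]
    return Counter(dataset[i][1] for i in others).most_common(1)[0][0]

def find_noisy_point(dataset, outputs, k=3):
    """Assumed heuristic: near each failing deployment output, look for a
    training point whose label disagrees with its own neighbourhood."""
    for x_new, predicted, feedback in outputs:
        if predicted == feedback:
            continue  # the deployment handled this input adequately
        nearest = sorted(range(len(dataset)),
                         key=lambda i: math.dist(dataset[i][0], x_new))[:k]
        for idx in nearest:
            if neighbour_majority(dataset, idx, k) != dataset[idx][1]:
                return idx
    return None

def diagnose(dataset, idx, k=3):
    """Place the point in feature space and name a (hypothetical) cause of failure."""
    majority = neighbour_majority(dataset, idx, k)
    cause = "label_mismatch" if majority != dataset[idx][1] else "unknown"
    return cause, majority

def refine(dataset, idx, cause, suggested_label):
    """Relabel a mislabelled point; otherwise drop it from the training dataset."""
    refined = list(dataset)
    if cause == "label_mismatch":
        refined[idx] = (dataset[idx][0], suggested_label)
    else:
        del refined[idx]
    return refined

# Toy training dataset: two clusters plus one point that carries the wrong label.
training = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"), ((1, 1), "a"),
            ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"), ((6, 6), "b"),
            ((1.5, 1.5), "b")]
params = train(training)

# Outputs received from the deployment: (new input, prediction, assumed feedback label).
new_input = (2.7, 2.7)
outputs = [(new_input, predict(params, new_input), "a")]

noisy_idx = find_noisy_point(training, outputs)
cause, suggested = diagnose(training, noisy_idx)
refined_training = refine(training, noisy_idx, cause, suggested)
new_params = train(refined_training)  # updated trained parameters to send to the user
```

In this toy run the mislabelled point pulls the "b" centroid toward the "a" cluster, so the new input is misclassified by the original parameters; relabelling that point and retraining moves the decision boundary back. A real system would substitute its own model, failure signal, and embedding space for each of these placeholders.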