US 11,676,033 B1
Training machine learning models to be robust against label noise
Aditya Krishna Menon, New York, NY (US); Ankit Singh Rawat, New York, NY (US); Sashank Jakkam Reddi, Jersey City, NJ (US); and Sanjiv Kumar, Jericho, NY (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Mar. 6, 2020, as Appl. No. 16/812,160.
Int. Cl. G06N 3/04 (2023.01); G06N 3/084 (2023.01)
CPC G06N 3/084 (2013.01) [G06N 3/04 (2013.01)]
20 Claims
 
1. A method of training a machine learning model having a plurality of model parameters and configured to receive a model input and to process the model input in accordance with the model parameters to generate a model output based on the model input, the method comprising:
obtaining a training input and a corresponding ground truth output;
processing the training input using the machine learning model and in accordance with current values of the model parameters to generate a training output based on the training input;
computing a loss for the training output by evaluating an objective function that measures a difference between the training output and the ground truth output, wherein the objective function is composed of a base loss and a link function; and
determining an update to current values of the model parameters, comprising:
determining, with respect to the model parameters, a first partial gradient of the loss with respect to the base loss and a second partial gradient of the loss with respect to the link function;
regularizing the first partial gradient of the loss to generate a regularized first partial gradient of the loss;
generating a recomposition of the regularized first partial gradient of the loss and the second partial gradient of the loss; and
computing the update from the generated recomposition.
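
The steps of claim 1 can be illustrated with a minimal sketch. It is one possible reading, not the claimed implementation, and it rests on several assumptions the claim does not state: the model is a linear binary classifier with labels in {-1, +1}, the link function is the sigmoid, the base loss is the negative logarithm, "regularizing" means clipping the base-loss derivative to a threshold tau, and the update is a plain gradient step. The names partially_clipped_step, tau, and lr are hypothetical.

```python
import math


def sigmoid(z):
    # Assumed link function: maps a real-valued score to a probability.
    return 1.0 / (1.0 + math.exp(-z))


def partially_clipped_step(w, x, y, tau=2.0, lr=0.1):
    """One illustrative parameter update for a linear model, with y in {-1, +1}.

    Assumed objective: loss = base(link(margin)), where link = sigmoid and
    base(p) = -log(p). Not hardened against overflow or p == 0.
    """
    # Process the training input with the current parameters.
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    p = sigmoid(margin)

    # Loss: base loss composed with the link function.
    loss = -math.log(p)

    # First partial gradient: derivative of the base loss w.r.t. the link output.
    base_grad = -1.0 / p
    # Second partial gradient: derivative of the link output w.r.t. the parameters.
    link_grad = [p * (1.0 - p) * y * xi for xi in x]

    # Regularize only the first partial gradient (assumed here: clipping to [-tau, tau]).
    clipped = max(-tau, min(tau, base_grad))

    # Recompose via the chain rule, then compute the update from the recomposition.
    grad = [clipped * g for g in link_grad]
    new_w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return new_w, loss, grad


if __name__ == "__main__":
    # A confidently misclassified example: the base-loss derivative is clipped.
    new_w, loss, grad = partially_clipped_step([-3.0, 0.0], [1.0, 0.0], 1)
    print(new_w, loss, grad)
```

Under these assumptions, clipping only the base-loss factor bounds the influence of an example that the model confidently misclassifies, which is the case a mislabeled example typically falls into, while leaving the link-function factor untouched.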