US 11,928,584 B2
Distributed hyperparameter tuning and load balancing for mathematical models
Bradford William Powley, Palo Alto, CA (US); Noah Burbank, Palo Alto, CA (US); and Rowan Cassius, San Francisco, CA (US)
Assigned to Salesforce, Inc., San Francisco, CA (US)
Filed by salesforce.com, inc., San Francisco, CA (US)
Filed on Jan. 31, 2020, as Appl. No. 16/778,587.
Prior Publication US 2021/0241164 A1, Aug. 5, 2021
Int. Cl. G06N 3/08 (2023.01); H04L 67/1001 (2022.01)
CPC G06N 3/08 (2013.01) [H04L 67/1001 (2022.05)] 20 Claims
 
1. A method for training a mathematical model associated with a plurality of hyperparameter values via a serial process, comprising:
generating a first plurality of combinations of hyperparameter values from the plurality of hyperparameter values associated with training the mathematical model;
identifying a subset of combinations of hyperparameter values from the first plurality of combinations of hyperparameter values, wherein one or more combinations of hyperparameter values of the subset of combinations of hyperparameter values are associated with a computational runtime that exceeds a first threshold that is a computational runtime threshold associated with a computational runtime value;
distributing combinations of hyperparameter values from the subset of combinations of hyperparameter values to be executed on a plurality of machines such that each machine of the plurality of machines is assigned a number of combinations of hyperparameter values from the subset of combinations of hyperparameter values that is less than a second threshold, wherein the second threshold is a threshold number of combinations of hyperparameter values from the subset of combinations of hyperparameter values;
testing each of the first plurality of combinations of hyperparameter values against the mathematical model using the plurality of machines in a parallel processing operation to generate a first plurality of validation error values, one or more validation error values of the first plurality of validation error values corresponding to the testing of one or more combinations of hyperparameter values of the first plurality of combinations of hyperparameter values against the mathematical model, wherein the parallel processing operation is operated in accordance with the distributed combinations of hyperparameter values from the subset of combinations of hyperparameter values, and wherein the testing of the first plurality of combinations of hyperparameter values against the mathematical model is a first part of the serial process; and
testing a second plurality of combinations of hyperparameter values against the mathematical model using an objective function, an input of the objective function being based at least in part on the first plurality of validation error values and the objective function generating a result value including combinations of hyperparameter values for training the mathematical model based at least in part on the input of the objective function, wherein the testing of the second plurality of combinations of hyperparameter values against the mathematical model is a second part of the serial process and is based at least in part on the first part of the serial process, wherein the result value is a result of the serial process such that the combinations of hyperparameter values included in the result value are associated with validation error values that are less than a validation error value threshold, and wherein the result value is generated based at least in part on the combinations of hyperparameter values from the subset of combinations of hyperparameter values being distributed on the plurality of machines.
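
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`make_grid`, `distribute`, `run_stage_one`, `run_stage_two`, `toy_runtime`, `toy_error`) is hypothetical, worker threads stand in for the plurality of machines, the runtime and validation error functions are toy stand-ins, and the second stage uses a simple local refinement around the best first-stage results in place of whatever objective function an actual embodiment would use.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def make_grid(space):
    """Generate the first plurality of combinations as a Cartesian product."""
    names = sorted(space)
    return [dict(zip(names, vals)) for vals in product(*(space[n] for n in names))]


def distribute(combos, runtime_of, runtime_threshold, n_machines, slow_cap):
    """Assign combos to machines so each gets fewer than slow_cap slow combos."""
    # The "subset": combinations whose estimated runtime exceeds the first threshold.
    slow = [c for c in combos if runtime_of(c) > runtime_threshold]
    fast = [c for c in combos if runtime_of(c) <= runtime_threshold]
    buckets = [[] for _ in range(n_machines)]
    # Spread the slow combinations round-robin across machines.
    for i, combo in enumerate(slow):
        buckets[i % n_machines].append(combo)
    # Enforce the second threshold (strictly fewer than slow_cap per machine).
    if max(len(b) for b in buckets) >= slow_cap:
        raise ValueError("too few machines to keep slow combinations under the cap")
    # Place each remaining combination on the currently least-loaded machine.
    for combo in fast:
        min(buckets, key=len).append(combo)
    return buckets


def run_stage_one(buckets, validation_error):
    """First part of the serial process: test every combination in parallel."""
    def run_bucket(bucket):
        return [(combo, validation_error(combo)) for combo in bucket]

    # One worker thread per bucket stands in for one machine.
    with ThreadPoolExecutor(max_workers=len(buckets)) as pool:
        per_machine = list(pool.map(run_bucket, buckets))
    return [pair for results in per_machine for pair in results]


def run_stage_two(stage_one, validation_error, error_threshold, top_k=2):
    """Second part: propose new combinations from stage-one errors and filter them.

    Assumption: a simple refinement that rescales "lr" around the top_k
    stage-one results; this stands in for the claimed objective function.
    """
    seeds = sorted(stage_one, key=lambda pair: pair[1])[:top_k]
    candidates = [
        dict(combo, lr=combo["lr"] * factor)
        for combo, _ in seeds
        for factor in (0.5, 1.0, 2.0)
    ]
    scored = [(c, validation_error(c)) for c in candidates]
    # Keep only combinations whose validation error is below the threshold.
    return [(c, e) for c, e in scored if e < error_threshold]


def toy_runtime(combo):
    """Toy runtime estimate: deeper models are assumed to take longer."""
    return combo["depth"]


def toy_error(combo):
    """Toy validation error with its minimum at lr=0.1, depth=4."""
    return (combo["lr"] - 0.1) ** 2 + 0.01 * abs(combo["depth"] - 4)


SPACE = {"depth": [2, 4, 8], "lr": [0.01, 0.1, 1.0]}
grid = make_grid(SPACE)
buckets = distribute(grid, toy_runtime, runtime_threshold=4, n_machines=3, slow_cap=2)
stage_one = run_stage_one(buckets, toy_error)
result = run_stage_two(stage_one, toy_error, error_threshold=0.01)
```

In this toy run, the three combinations with `depth` 8 form the slow subset and land on three different machines, so no machine waits on more than one long-running job before the first stage finishes. The second stage starts only once all first-stage validation errors are available, which mirrors the serial ordering in the claim.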