US 11,900,222 B1
Efficient machine learning model architecture selection
Jyrki A. Alakuijala, Wollerau (CH); Quentin Lascombes de Laroussilhe, Zurich (CH); Andrey Khorlin, Zurich (CH); Jeremiah Joseph Harmsen, Zurich (CH); and Andrea Gesmundo, Zurich (CH)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Mar. 15, 2019, as Appl. No. 16/355,185.
Int. Cl. G06N 20/00 (2019.01); G06N 3/08 (2023.01)

CPC G06N 20/00 (2019.01) [G06N 3/08 (2013.01)]

16 Claims

1. A method performed by one or more data processing apparatus, the method comprising:

receiving a request to train a machine learning model to perform a machine learning task on a set of training examples, wherein each training example comprises a training input and a corresponding target output, and the corresponding target output represents an output that should be generated by processing the training input using the machine learning model;

determining a set of one or more meta-data values characterizing attributes of the set of training examples, the attributes comprising a complexity of the training inputs, the complexity of the training inputs based on a number of principal components required to reconstruct the training inputs with a predetermined level of accuracy;

receiving a predetermined set of machine learning model architectures, each machine learning model architecture in the predetermined set of machine learning model architectures represented by a respective function parameterized by the complexity of the training inputs such that the respective parameterized function has one or more parameters corresponding to the complexity of the training inputs;

determining, using a mapping function mapping each of the set of one or more meta-data values characterizing the attributes of the set of training examples to one or more of the machine learning model architectures of the predetermined set of machine learning model architectures based on the respective parameterized function of each of the one or more of the machine learning model architectures, a particular machine learning model architecture by mapping each of the set of one or more meta-data values characterizing the attributes of the set of training examples to a random forest architecture, wherein a maximum depth of a decision tree in the random forest architecture is parameterized based on the complexity of the training inputs;

selecting, using the particular machine learning model architecture, a final machine learning model architecture for performing the machine learning task; and

training a machine learning model having the final machine learning model architecture on the set of training examples.
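
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not give a threshold, a depth formula, or any identifiers, so the function names, the 0.95 threshold, and the logarithmic depth rule below are all assumptions chosen for illustration. The sketch uses the fraction of variance retained by the leading principal components as a stand-in for the "predetermined level of accuracy" of reconstruction.

```python
import math

import numpy as np


def pca_complexity(inputs: np.ndarray, explained: float = 0.95) -> int:
    """Count the principal components needed to retain `explained` of the variance.

    Assumption: retained variance is used as the reconstruction-accuracy measure.
    """
    centered = inputs - inputs.mean(axis=0)
    # Squared singular values of the centered data are proportional to the
    # variance captured by each principal component.
    variance = np.linalg.svd(centered, compute_uv=False) ** 2
    total = variance.sum()
    if total == 0.0:
        return 0  # constant inputs need no components
    cumulative = np.cumsum(variance) / total
    # First index whose cumulative share reaches the threshold (small tolerance
    # guards against floating-point rounding just below the threshold).
    count = int(np.searchsorted(cumulative, explained - 1e-12)) + 1
    return min(count, len(variance))


def random_forest_architecture(complexity: int) -> dict:
    """Hypothetical parameterized architecture: tree depth grows with complexity."""
    max_depth = 2 + math.ceil(math.log2(complexity + 1))
    return {"family": "random_forest", "max_depth": max_depth}


def select_architecture(meta_data: dict) -> dict:
    """Hypothetical mapping function from meta-data values to an architecture."""
    return random_forest_architecture(meta_data["complexity"])


# Toy training inputs that vary along two of their three dimensions.
toy_inputs = np.array(
    [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]]
)
meta_data = {"complexity": pca_complexity(toy_inputs)}
architecture = select_architecture(meta_data)
```

For the toy inputs, two principal components are needed, so this sketch selects a maximum tree depth of 4. The resulting configuration could then be passed to any random forest trainer that accepts a maximum depth, for example the `max_depth` parameter of scikit-learn's `RandomForestClassifier`.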