US 12,423,573 B2
Neural network optimizer search
Irwan Bello, San Francisco, CA (US); Barret Zoph, Sunnyvale, CA (US); Vijay Vasudevan, Los Altos Hills, CA (US); and Quoc V. Le, Sunnyvale, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Jan. 11, 2021, as Appl. No. 17/145,524.
Application 17/145,524 is a continuation of application No. 16/662,924, filed on Oct. 24, 2019, now U.S. Pat. No. 10,922,611.
Application 16/662,924 is a continuation of application No. PCT/US2018/030281, filed on Apr. 30, 2018.
Claims priority of provisional application 62/492,021, filed on Apr. 28, 2017.
Prior Publication US 2021/0271970 A1, Sep. 2, 2021
Int. Cl. G06N 3/08 (2023.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01)

CPC G06N 3/08 (2013.01) [G06N 3/044 (2023.01); G06N 3/045 (2023.01)]

16 Claims

1. A method for training a particular neural network to perform a particular neural network task, the method comprising repeatedly performing:

determining a gradient with respect to parameters of the particular neural network;

applying, to at least the gradient with respect to the parameters of the particular neural network, an update rule to generate an update to values of the parameters of the particular neural network; and

applying the update to the values of the parameters of the particular neural network,

wherein the update rule has been generated by performing the following operations:

determining the update rule using a controller neural network that is different from the particular neural network, comprising:

generating, using the controller neural network having a plurality of controller parameters and in accordance with current values of the plurality of controller parameters, a plurality of output sequences, each generated output sequence defining a respective candidate update rule and including a respective character selected at each of a plurality of time steps, wherein the respective character at each time step is selected from a set of possible characters for the time step, and

wherein the controller neural network is configured to, for a given output sequence and at each time step, receive as input the character at a preceding step and process the character through an embedding neural network layer, one or more intermediate neural network layers, and one or more output layers to generate an output for the time step that defines a score distribution over possible characters for the time step,

for each generated output sequence:

training a respective instance of a child neural network to perform a neural network task by repeatedly performing the following:

determining, through backpropagation, a gradient with respect to parameters of the instance of the child neural network,

applying, to at least the gradient with respect to the parameters of the instance of the child neural network, the candidate update rule defined by the output sequence generated by the controller neural network to generate an update to values of the parameters of the instance of the child neural network, and

applying the update to the values of the parameters of the instance of the child neural network, and

evaluating a performance of the trained instance of the child neural network on the particular neural network task to determine a performance metric for the trained instance of the child neural network on the particular neural network task;

using the performance metrics for the trained instances of the child neural network to adjust the current values of the plurality of controller parameters of the controller neural network; and

generating, using the controller neural network in accordance with the adjusted values of the plurality of controller parameters, a final output sequence that defines the update rule for updating the values of parameters of the particular neural network.
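
The search loop recited in claim 1 can be illustrated with a small, self-contained sketch. It is not the patented implementation, and everything in it is an assumption made for illustration: the three-step character vocabulary (two operands and one binary function), the tabular policy `TabularController` that stands in for the embedding, intermediate, and output layers of the controller neural network, the REINFORCE-style adjustment with a moving-average baseline, and the one-parameter quadratic "child" whose gradient is computed analytically rather than by backpropagation. All function and class names are hypothetical.

```python
import math
import random

# Hypothetical vocabularies: one set of possible characters per time step.
OPERANDS = ["g", "sign(g)", "m"]       # gradient, its sign, running average of g
BINARY = ["add", "mul"]
VOCAB = [OPERANDS, OPERANDS, BINARY]   # time steps 0, 1, 2

def softmax(logits):
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

class TabularController:
    """Stand-in controller: one logit row per (time step, preceding character)."""
    def __init__(self):
        self.logits = []
        for t, chars in enumerate(VOCAB):
            n_prev = 1 if t == 0 else len(VOCAB[t - 1])  # row 0 at t=0 is a start token
            self.logits.append([[0.0] * len(chars) for _ in range(n_prev)])

    def generate(self, rng, greedy=False):
        """Select one character index per time step, conditioned on the preceding one."""
        seq, prev = [], 0
        for t in range(len(VOCAB)):
            probs = softmax(self.logits[t][prev])  # score distribution for this step
            if greedy:
                idx = max(range(len(probs)), key=probs.__getitem__)
            else:
                idx = rng.choices(range(len(probs)), weights=probs)[0]
            seq.append(idx)
            prev = idx
        return seq

    def reinforce(self, seq, advantage, lr):
        """Policy-gradient step (assumed here) on the logits used to produce seq."""
        prev = 0
        for t, idx in enumerate(seq):
            row = self.logits[t][prev]
            probs = softmax(row)
            for k in range(len(row)):
                grad_logp = (1.0 if k == idx else 0.0) - probs[k]
                row[k] += lr * advantage * grad_logp
            prev = idx

def apply_rule(seq, g, m):
    """Candidate update rule defined by an output sequence: binary(op1, op2)."""
    sign = math.copysign(1.0, g) if g else 0.0
    values = {"g": g, "sign(g)": sign, "m": m}
    a, b = values[OPERANDS[seq[0]]], values[OPERANDS[seq[1]]]
    return a + b if BINARY[seq[2]] == "add" else a * b

def train_child(seq, steps=30, lr=0.05):
    """Train a toy child (loss = (w - 3)^2) with the rule; return a metric in (0, 1]."""
    w, m = 0.0, 0.0
    for _ in range(steps):
        g = 2.0 * (w - 3.0)                  # analytic gradient of the toy loss
        m = 0.9 * m + 0.1 * g
        w -= lr * apply_rule(seq, g, m)      # apply the update to the parameter
        w = max(-1e6, min(1e6, w))           # clip so divergent rules stay finite
    loss = (w - 3.0) * (w - 3.0)
    return 1.0 / (1.0 + loss)

def search(iterations=200, batch=8, seed=0):
    rng = random.Random(seed)
    controller = TabularController()
    baseline = 0.0
    for _ in range(iterations):
        seqs = [controller.generate(rng) for _ in range(batch)]  # candidate rules
        metrics = [train_child(s) for s in seqs]                 # one child per rule
        for s, r in zip(seqs, metrics):
            controller.reinforce(s, r - baseline, lr=0.5)        # adjust controller
        baseline = 0.9 * baseline + 0.1 * (sum(metrics) / batch)
    return controller.generate(rng, greedy=True)                 # final output sequence

final_seq = search()
final_rule = [VOCAB[t][i] for t, i in enumerate(final_seq)]
print(final_rule, train_child(final_seq))
```

In this sketch the last call to `train_child(final_seq)` plays the role of training the particular neural network with the final update rule. A rule such as `["g", "g", "add"]` reduces to scaled gradient descent and converges on the toy loss, while rules such as `["g", "g", "mul"]` diverge and receive a metric near zero, which is what gives the controller a usable training signal.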