US 12,271,823 B2
Training machine learning models by determining update rules using neural networks
Misha Man Ray Denil, London (GB); Tom Schaul, London (GB); Marcin Andrychowicz, London (GB); Joao Ferdinando Gomes de Freitas, London (GB); Sergio Gomez Colmenarejo, London (GB); Matthew William Hoffman, London (GB); and David Benjamin Pfau, London (GB)
Assigned to DeepMind Technologies Limited, London (GB)
Filed by DeepMind Technologies Limited, London (GB)
Filed on Mar. 8, 2023, as Appl. No. 18/180,754.
Application 18/180,754 is a continuation of application No. 16/302,592, granted, now Pat. No. 11,615,310, previously published as PCT/US2017/033703, which was filed on May 19, 2017.
Claims priority of provisional application 62/339,785, filed on May 20, 2016.
Prior Publication US 2023/0376771 A1, Nov. 23, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/084 (2023.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01)
CPC G06N 3/084 (2013.01) [G06N 3/044 (2023.01); G06N 3/045 (2023.01)] 19 Claims
OG exemplary drawing
 
1. A computer-implemented method for training a target machine learning model having a plurality of model parameters using an optimizer neural network having a plurality of optimizer parameters, the method comprising, at each current iteration of a plurality of iterations:
determining a respective update rule for each of the plurality of model parameters using the optimizer neural network by operating the optimizer neural network independently on each of the plurality of model parameters of the target machine learning model, the determining comprising, for each respective model parameter:
generating a respective parameter-specific input that comprises a gradient of a target objective function of the target machine learning model with respect to the respective model parameter; and
processing the respective parameter-specific input using the optimizer neural network and in accordance with current values of the optimizer parameters to generate a respective optimizer output that specifies a respective parameter-specific update rule for updating the respective model parameter;
applying the update rules generated by the optimizer neural network to the model parameters of the target machine learning model to update values of the model parameters; and
updating the current values of the optimizer parameters by using gradient descent techniques to minimize an optimizer objective function that depends at least on a function value of the target objective function of the target machine learning model computed using the values of the model parameters of the target machine learning model that have been updated at the current iteration.
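The training loop recited in claim 1 can be pictured with a minimal JAX sketch. This is an illustrative assumption, not the patented implementation: the names (quadratic_loss, optimizer_net, apply_update_rules, train_iteration), the toy quadratic target objective, and the one-hidden-layer feed-forward optimizer network are hypothetical stand-ins chosen so the example runs end to end. The sketch operates the optimizer network independently on each model parameter's gradient to produce a parameter-specific update, applies those updates to the model parameters, and then takes a gradient-descent step on the optimizer parameters against an optimizer objective equal to the target loss at the just-updated model parameters.

# Minimal illustrative sketch (not the patented implementation) of the per-parameter
# learned-optimizer loop described in claim 1. All names and the toy quadratic
# target objective below are hypothetical stand-ins.
import jax
import jax.numpy as jnp

def quadratic_loss(theta):
    # Stand-in target objective function of the target machine learning model.
    return jnp.sum((theta - 3.0) ** 2)

def init_optimizer_params(key, hidden=8):
    # Optimizer parameters: a tiny one-hidden-layer network, scalar in / scalar out.
    k1, k2 = jax.random.split(key)
    return {
        "w1": 0.1 * jax.random.normal(k1, (1, hidden)),
        "b1": jnp.zeros(hidden),
        "w2": 0.1 * jax.random.normal(k2, (hidden, 1)),
        "b2": jnp.zeros(1),
    }

def optimizer_net(phi, grad_coord):
    # Parameter-specific input: the gradient with respect to one model parameter.
    h = jnp.tanh(grad_coord[None] @ phi["w1"] + phi["b1"])
    # Optimizer output: the parameter-specific update ("update rule") for that parameter.
    return (h @ phi["w2"] + phi["b2"])[0]

def apply_update_rules(phi, theta):
    # Operate the optimizer network independently on each model parameter.
    grads = jax.grad(quadratic_loss)(theta)
    updates = jax.vmap(lambda g: optimizer_net(phi, g))(grads)
    return theta + updates

def optimizer_objective(phi, theta):
    # Optimizer objective: target loss evaluated at the just-updated model parameters.
    return quadratic_loss(apply_update_rules(phi, theta))

@jax.jit
def train_iteration(phi, theta, meta_lr=1e-2):
    # Determine and apply the per-parameter update rules to the model parameters.
    new_theta = apply_update_rules(phi, theta)
    # Update the optimizer parameters by gradient descent on the optimizer objective.
    meta_grads = jax.grad(optimizer_objective)(phi, theta)
    new_phi = jax.tree_util.tree_map(lambda p, g: p - meta_lr * g, phi, meta_grads)
    return new_phi, new_theta

phi = init_optimizer_params(jax.random.PRNGKey(0))
theta = jnp.array([10.0, -4.0, 0.5])
for _ in range(200):
    phi, theta = train_iteration(phi, theta)
print(quadratic_loss(theta))  # target loss should shrink as the learned optimizer improves

A fuller realization of the claimed scheme would more plausibly use a recurrent optimizer network carrying per-parameter state across iterations and an optimizer objective accumulated over several unrolled iterations; the sketch keeps a single feed-forward step per iteration only for brevity.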