US 12,462,145 B2
Progressive neural networks
Neil Charles Rabinowitz, Hertfordshire (GB); Guillaume Desjardins, London (GB); Andrei-Alexandru Rusu, London (GB); Koray Kavukcuoglu, London (GB); Raia Thais Hadsell, London (GB); Razvan Pascanu, Letchworth Garden City (GB); James Kirkpatrick, London (GB); and Hubert Josef Soyer, London (GB)
Assigned to DeepMind Technologies Limited, London (GB)
Filed by DeepMind Technologies Limited, London (GB)
Filed on Oct. 2, 2023, as Appl. No. 18/479,775.
Application 18/479,775 is a continuation of application No. 17/201,542, filed on Mar. 15, 2021, granted, now 11,775,804.
Application 17/201,542 is a continuation of application No. 15/396,319, filed on Dec. 30, 2016, granted, now 10,949,734, issued on Mar. 16, 2021.
Claims priority of provisional application 62/339,719, filed on May 20, 2016.
Prior Publication US 2024/0119262 A1, Apr. 11, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/045 (2023.01); G06F 17/16 (2006.01); G06N 3/08 (2023.01)
CPC G06N 3/045 (2023.01) [G06F 17/16 (2013.01); G06N 3/08 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method performed by one or more computers, the method comprising:
processing inputs using a sequence of deep neural networks (DNNs),
wherein each DNN in the sequence of DNNs has been trained to perform a respective machine learning task of a sequence of machine learning tasks, wherein the sequence of DNNs comprises:
a first DNN that corresponds to a first machine learning task of the sequence of machine learning tasks, wherein
(i) the first DNN comprises a first plurality of indexed layers, and
(ii) each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the respective layer input to generate a respective layer output; and
one or more subsequent DNNs corresponding to one or more respective subsequent machine learning tasks of the sequence of machine learning tasks, wherein
(i) each subsequent DNN comprises a respective plurality of indexed layers, and
(ii) each layer in a respective plurality of indexed layers with index i greater than one receives input from
(i) a preceding layer of the respective subsequent DNN, and
(ii) one or more preceding layers of respective preceding DNNs through respective outputs of respective non-linear lateral connections, wherein a preceding layer is a layer whose index is one less than the index i, and wherein the respective non-linear lateral connections represent a learned, non-linear transformation of the respective layer outputs of the one or more preceding layers of the respective preceding DNNs; and
(iii) each layer in the respective plurality of indexed layers with index i greater than one:
generates a respective activation by processing (i) the input received from the preceding layer of the respective subsequent DNN and (ii) the respective outputs of each of the respective non-linear lateral connections applied to the respective layer outputs of the one or more preceding layers of the respective preceding DNNs;
wherein processing the inputs comprises:
receiving a first input for a last machine learning task of the sequence of machine learning tasks;
processing the first input using the respective DNNs of the sequence of DNNs; and
using a last subsequent DNN in the sequence to generate a last subsequent DNN output for performing the last machine learning task.
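
Claim 1 above recites the progressive-network structure in prose: a sequence of DNN "columns" in which each layer with index i greater than one of a later column combines (i) the output of its own preceding layer with (ii) the preceding-layer outputs of every earlier column, each passed through a learned, non-linear lateral connection. The sketch below is illustrative only and is not the patented implementation; it assumes PyTorch, fully connected layers, ReLU activations, equal layer widths across columns, and a two-layer MLP as the non-linear lateral adapter, all of which are choices the claim leaves open.

```python
# Illustrative sketch of one "column" (one DNN in the claimed sequence).
# Assumptions (not from the claim): PyTorch, nn.Linear layers, ReLU activations,
# all columns share the same layer widths, and each non-linear lateral
# connection is a small Linear-ReLU-Linear adapter.
import torch
import torch.nn as nn


class Column(nn.Module):
    """One DNN in the sequence; `prev_columns` are the earlier-task DNNs."""

    def __init__(self, sizes, prev_columns=()):
        super().__init__()
        # Kept as a plain list so earlier columns are NOT registered as
        # submodules: an optimizer over self.parameters() trains only this
        # column and its lateral adapters.
        self.prev_columns = list(prev_columns)
        # Own layers: layer i maps sizes[i] -> sizes[i + 1].
        self.layers = nn.ModuleList(
            nn.Linear(sizes[i], sizes[i + 1]) for i in range(len(sizes) - 1)
        )
        # Non-linear lateral connections: for every layer with index > 1 and
        # every earlier column, an MLP that transforms that column's layer i-1
        # output into a contribution to this column's layer i pre-activation.
        self.laterals = nn.ModuleList(
            nn.ModuleList(
                nn.Sequential(
                    nn.Linear(sizes[i], sizes[i]),
                    nn.ReLU(),
                    nn.Linear(sizes[i], sizes[i + 1]),
                )
                for _ in self.prev_columns
            )
            for i in range(1, len(sizes) - 1)
        )

    def hidden_states(self, x):
        """Outputs of every layer, reusing the earlier columns' layer outputs."""
        prev_states = [col.hidden_states(x) for col in self.prev_columns]
        states = [torch.relu(self.layers[0](x))]
        for i in range(1, len(self.layers)):
            # Input from this column's preceding layer ...
            pre = self.layers[i](states[-1])
            # ... plus the adapted layer i-1 outputs of each preceding column.
            for lateral, prev in zip(self.laterals[i - 1], prev_states):
                pre = pre + lateral(prev[i - 1])
            states.append(torch.relu(pre))
        return states

    def forward(self, x):
        return self.hidden_states(x)[-1]


# Usage sketch: train column 1 on the first task, then add column 2 for the
# next task; the output of the last column serves the last task in the sequence.
col1 = Column([8, 16, 16, 4])
for p in col1.parameters():          # earlier columns are typically kept fixed
    p.requires_grad_(False)
col2 = Column([8, 16, 16, 4], prev_columns=[col1])
out = col2(torch.randn(32, 8))       # last subsequent DNN output
```

In this sketch the earlier columns are excluded from the new column's parameter list, so only the new layers and lateral adapters receive gradients; this mirrors the progressive-networks approach of the same authors, in which preceding DNNs already trained on earlier tasks are held fixed while each new column is trained.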