US 11,790,233 B2
Generating larger neural networks
Ian Goodfellow, Mountain View, CA (US); Tianqi Chen, Seattle, WA (US); and Jonathon Shlens, San Francisco, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Jun. 29, 2020, as Appl. No. 16/915,502.
Application 16/915,502 is a continuation of application No. 15/349,901, filed on Nov. 11, 2016, granted, now 10,699,191.
Claims priority of provisional application 62/254,629, filed on Nov. 12, 2015.
Prior Publication US 2020/0401896 A1, Dec. 24, 2020
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/08 (2023.01); G06N 3/04 (2023.01); G06N 3/082 (2023.01); G06N 3/045 (2023.01)
CPC G06N 3/082 (2013.01) [G06N 3/04 (2013.01); G06N 3/045 (2023.01)]
27 Claims
1. A method performed by one or more computers, the method comprising:
identifying an original neural network structure for an original neural network, the original neural network being configured to generate neural network outputs from neural network inputs, the original neural network structure comprising a plurality of original neural network units, each original neural network unit having respective parameters, and each of the parameters of each of the original neural network units having a respective original value;
generating a larger neural network having a larger neural network structure, the larger neural network structure comprising:
(i) the plurality of original neural network units, and
(ii) a plurality of additional neural network units not in the original neural network structure, each additional neural network unit having respective parameters;
initializing values of the parameters of the original neural network units and the additional neural network units by setting the values of the parameters of the original neural network units and the additional neural network units to values that result in the larger neural network generating, for any particular neural network input, the same neural network output for the particular neural network input as would be generated by the original neural network by processing the particular neural network input in accordance with the original parameter values for the original neural network units; and
training the larger neural network to determine trained values of the parameters of the original neural network units and the additional neural network units from the initialized values.
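
The initializing step of claim 1 can be illustrated with a minimal sketch. It is not taken from the patent text and is only one possible way to obtain a larger network that reproduces the original outputs. It assumes a two-layer fully connected network with ReLU hidden units, and it adds hidden units by copying existing ones and splitting their outgoing weights evenly among the copies. The names `forward`, `widen`, `w1`, `b1`, `w2`, `b2`, and `extra` are hypothetical and chosen for illustration.

```python
import random


def relu(x):
    return x if x > 0.0 else 0.0


def forward(x, w1, b1, w2, b2):
    """Two-layer network: hidden = relu(x @ w1 + b1), output = hidden @ w2 + b2.

    w1 has shape [n_in][n_hidden]; w2 has shape [n_hidden][n_out].
    """
    hidden = [
        relu(sum(xi * w1[i][j] for i, xi in enumerate(x)) + b1[j])
        for j in range(len(b1))
    ]
    return [
        sum(h * w2[j][k] for j, h in enumerate(hidden)) + b2[k]
        for k in range(len(b2))
    ]


def widen(w1, b1, w2, extra, rng):
    """Return (new_w1, new_b1, new_w2) for a hidden layer with `extra` more units.

    Assumption for this sketch: each additional unit is a copy of a randomly
    chosen original unit. The outgoing weights of a unit and all of its copies
    are divided by the number of copies, so their summed contribution to the
    next layer is unchanged. The output bias b2 is not modified.
    """
    n_hidden = len(b1)
    # Which original unit each additional unit replicates.
    sources = [rng.randrange(n_hidden) for _ in range(extra)]
    # Number of units in the larger layer that compute original unit j's activation.
    counts = [1 + sources.count(j) for j in range(n_hidden)]
    # Original units first, then the additional units.
    mapping = list(range(n_hidden)) + sources

    new_w1 = [[row[j] for j in mapping] for row in w1]          # copy incoming weights
    new_b1 = [b1[j] for j in mapping]                           # copy biases
    new_w2 = [[w / counts[j] for w in w2[j]] for j in mapping]  # split outgoing weights
    return new_w1, new_b1, new_w2
```

Calling `widen(w1, b1, w2, extra=5, rng=random.Random(0))` and then evaluating `forward` with the returned parameters should give the same outputs as the original parameters, up to floating-point rounding. Training of the larger network would then start from these initialized values.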