US 12,346,817 B2
Neural architecture search
Barret Zoph, Sunnyvale, CA (US); Yun Jia Guan, Stanford, CA (US); Hieu Hy Pham, Redwood City, CA (US); and Quoc V. Le, Sunnyvale, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Apr. 16, 2021, as Appl. No. 17/232,803.
Application 17/232,803 is a continuation of application No. 16/859,781, filed on Apr. 27, 2020, granted, now Pat. No. 10,984,319, issued on Apr. 20, 2021.
Application 16/859,781 is a continuation of application No. PCT/US2018/058041, filed on Oct. 29, 2018.
Claims priority of provisional application 62/578,361, filed on Oct. 27, 2017.
Prior Publication US 2021/0232929 A1, Jul. 29, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/082 (2023.01); G06N 3/04 (2023.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/047 (2023.01)
CPC G06N 3/082 (2013.01) [G06N 3/04 (2013.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/047 (2023.01)]
20 Claims
 
1. A method of determining an architecture for a neural network for performing a particular neural network task, the method comprising:
generating, in accordance with current values of a plurality of controller parameters, a batch of output sequences, each output sequence in the batch specifying a respective subset of a plurality of components of a large neural network, wherein the large neural network includes the respective subset of the plurality of components and one or more other components of the plurality of components that are not in the respective subset, wherein the large neural network has a plurality of large network parameters, wherein the large neural network comprises a plurality of layers, and wherein the respective subset of the plurality of components of the large neural network forms a smaller neural network that (i) includes only the respective subset of the plurality of components, (ii) does not include the one or more other components of the plurality of components of the large neural network that are not in the respective subset and (iii) has, for each component in the respective subset, current values of the large neural network parameters for that component;
for each output sequence in the batch:
determining a performance metric of the smaller neural network on the particular neural network task in accordance with current values of the large network parameters for the components in the smaller neural network; and
using, by an updating engine, the performance metrics for the output sequences in the batch to adjust the current values of the controller parameters;
generating, in accordance with the adjusted values of the controller parameters, a new output sequence that specifies a new subset of the plurality of components of the large neural network, the new subset of the plurality of components forming a new neural network; and
training, by a training engine in collaboration with the updating engine, the new neural network with only the components in the new subset specified by the new output sequence on training data to determine adjusted values of the large network parameters for the components in the new subset.
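
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the patented implementation: the "large neural network" is reduced to four hypothetical components with one shared scalar weight each, an output sequence is a 0/1 decision per component, the performance metric is negative mean squared error on a toy task y = 2x, and the controller update is a REINFORCE-style step with a batch-mean baseline. The claim only requires that the metrics be used to adjust the controller parameters and does not name a specific update rule. All function names, constants, and learning rates are illustrative.

```python
import math
import random

NUM_COMPONENTS = 4  # hypothetical size of the large network's component pool


def sample_sequence(theta, rng):
    """Sample one output sequence: a 0/1 decision per component."""
    return [1 if rng.random() < 1 / (1 + math.exp(-t)) else 0 for t in theta]


def forward(weights, seq, x):
    """Smaller network: only the selected components, using the shared weights."""
    return sum(w * x for w, s in zip(weights, seq) if s)


def performance(weights, seq, data):
    """Performance metric (assumed here): negative mean squared error."""
    return -sum((forward(weights, seq, x) - y) ** 2 for x, y in data) / len(data)


def update_controller(theta, batch, metrics, lr):
    """REINFORCE-style step with a batch-mean baseline (an assumed update rule)."""
    baseline = sum(metrics) / len(metrics)
    new_theta = list(theta)
    for seq, m in zip(batch, metrics):
        for i, s in enumerate(seq):
            p = 1 / (1 + math.exp(-theta[i]))
            # Gradient of log Bernoulli(s; p) with respect to the logit is s - p.
            new_theta[i] += lr * (m - baseline) * (s - p) / len(batch)
    return new_theta


def train_subset(weights, seq, data, lr, steps):
    """Gradient descent that adjusts only the selected components' weights."""
    w = list(weights)
    for _ in range(steps):
        for x, y in data:
            err = forward(w, seq, x) - y
            for i, s in enumerate(seq):
                if s:
                    w[i] -= lr * 2 * err * x
    return w


rng = random.Random(0)
data = [(x / 4, 2 * x / 4) for x in range(1, 5)]  # toy task: y = 2x
weights = [0.1] * NUM_COMPONENTS  # large network parameters (shared)
theta = [0.0] * NUM_COMPONENTS    # controller parameters

# Generate a batch of output sequences and score each smaller network
# using the current shared weights, without training it first.
batch = [sample_sequence(theta, rng) for _ in range(8)]
metrics = [performance(weights, seq, data) for seq in batch]

# Use the batch of metrics to adjust the controller parameters.
theta = update_controller(theta, batch, metrics, lr=0.5)

# Generate a new output sequence and train only the components it selects.
new_seq = sample_sequence(theta, rng)
new_weights = train_subset(weights, new_seq, data, lr=0.05, steps=50)
```

In this sketch the weights of components outside the new subset are left untouched by `train_subset`, which mirrors the claim's requirement that training determine adjusted values only for the components in the new subset.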