US 11,989,656 B2
Search space exploration for deep learning
Chao Xue, Beijing (CN); Yonggang Hu, Markham (CA); Lin Dong, Beijing (CN); and Ke Wei Sun, Beijing (CN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Jul. 22, 2020, as Appl. No. 16/935,445.
Prior Publication US 2022/0027739 A1, Jan. 27, 2022
Int. Cl. G06N 3/086 (2023.01); G06N 3/045 (2023.01)
CPC G06N 3/086 (2013.01) [G06N 3/045 (2023.01)]
20 Claims
 
1. A computer-implemented method comprising:
obtaining, using a processor, meta features corresponding with a dataset configured to be used for training in a deep learning application;
selecting, using the processor, an initial search space, wherein the initial search space defines a type of deep learning architecture representation to represent two or more neural network architectures and specifies hyperparameters for the two or more neural network architectures;
applying, using the processor, a search strategy to the initial search space, wherein the search strategy performs an evaluation of each of the two or more neural network architectures represented according to the initial search space;
selecting, using the processor, one of the two or more neural network architectures based on a result of the evaluation according to the search strategy;
generating, using the processor, a new search space with new hyperparameters that differ from the hyperparameters specified by the initial search space using an evolutionary algorithm and a mutation type selected from a plurality of mutation types, wherein each mutation type defines one or more changes in the hyperparameters specified by the initial search space, each mutation type is part of a first category or a second category, the generating is performed iteratively for each selected mutation type, and the generating includes, for each iteration:
randomly selecting the mutation type,
performing a check to determine whether the selected mutation type is part of the first category or the second category,
based on the selected mutation type being part of the first category, applying the search strategy to the new search space to re-select a neural network architecture, and
based on the selected mutation type being part of the second category, applying the new hyperparameters to the one of the two or more neural networks to re-select the neural network architecture;
based on a result of applying the search strategy or the new hyperparameters having a sufficient accuracy, acquiring a color image dataset and training the re-selected neural network architecture using the color image dataset;
based on the result having an insufficient accuracy, repeating the iteration until the re-selected neural network architecture has at least the sufficient accuracy, and training the re-selected neural network architecture using the dataset; and
inputting color image data related to an application of interest to the trained re-selected neural network architecture, and performing at least one of feature extraction and classification by the trained re-selected neural network architecture.
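
The iterative portion of claim 1 (random selection of a mutation type, the category check, and the two re-selection paths) can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in and not taken from the patent: the mutation names, their assignment to the first or second category, the depth-only architecture representation, the `toy_evaluate` scoring function, the accept-if-not-worse rule, and the `max_iters` safety bound. The final training step and the color image inference step are omitted.

```python
import copy
import random

# Hypothetical mutation types. The claim only requires that each type
# belongs to a first or a second category; this assignment is assumed.
FIRST_CATEGORY = ("extend_depth_range",)                        # re-run the search
SECOND_CATEGORY = ("halve_learning_rate", "halve_batch_size")   # reuse the architecture
MUTATION_TYPES = FIRST_CATEGORY + SECOND_CATEGORY


def toy_evaluate(depth, hparams):
    """Stand-in for training and validation; returns a pseudo-accuracy in [0, 1]."""
    score = 0.5 + 0.05 * depth
    score -= 5.0 * abs(hparams["learning_rate"] - 0.01)
    score -= 0.001 * abs(hparams["batch_size"] - 64)
    return max(0.0, min(1.0, score))


def search_strategy(space):
    """Evaluate every architecture in the space and select the best one."""
    scored = [(toy_evaluate(d, space["hparams"]), d) for d in space["depths"]]
    accuracy, depth = max(scored)
    return depth, accuracy


def mutate(space, mutation):
    """Return a new search space that differs from `space` by one mutation."""
    new_space = copy.deepcopy(space)
    if mutation == "extend_depth_range":
        new_space["depths"].append(max(new_space["depths"]) + 1)
    elif mutation == "halve_learning_rate":
        new_space["hparams"]["learning_rate"] /= 2
    elif mutation == "halve_batch_size":
        new_space["hparams"]["batch_size"] = max(1, new_space["hparams"]["batch_size"] // 2)
    else:
        raise ValueError(f"unknown mutation type: {mutation}")
    return new_space


def explore(initial_space, threshold=0.9, max_iters=200, seed=0):
    """Iterate mutations until the re-selected architecture is accurate enough."""
    rng = random.Random(seed)
    space = copy.deepcopy(initial_space)
    depth, accuracy = search_strategy(space)  # initial search and selection
    iterations = 0
    while accuracy < threshold and iterations < max_iters:
        iterations += 1
        mutation = rng.choice(MUTATION_TYPES)  # randomly select the mutation type
        candidate = mutate(space, mutation)
        if mutation in FIRST_CATEGORY:
            # First category: apply the search strategy to the new search space.
            new_depth, new_accuracy = search_strategy(candidate)
        else:
            # Second category: apply the new hyperparameters to the
            # already-selected architecture without searching again.
            new_depth = depth
            new_accuracy = toy_evaluate(depth, candidate["hparams"])
        # Assumed survival rule: keep the candidate only if it is not worse.
        if new_accuracy >= accuracy:
            space, depth, accuracy = candidate, new_depth, new_accuracy
    return {"depth": depth, "hparams": space["hparams"],
            "accuracy": accuracy, "iterations": iterations}


if __name__ == "__main__":
    start = {"depths": [2, 3],
             "hparams": {"learning_rate": 0.08, "batch_size": 256}}
    print(explore(start))
```

In this sketch a first-category mutation changes which architectures the search space contains, so the search is repeated, while a second-category mutation only changes hyperparameter values, so the previously selected architecture is re-evaluated directly. That split is one plausible reading of the two categories, not a statement of how the claimed method defines them.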