CPC G06V 10/765 (2022.01) [G06V 10/776 (2022.01); G06V 10/82 (2022.01); G06V 20/70 (2022.01)] | 6 Claims |
1. An image classification method for maximizing mutual information, comprising:
acquiring training images;
maximizing mutual information between the training images and a neural network architecture, and automatically determining the neural network architecture and parameters of the neural network;
processing image data to be classified using the obtained neural network to obtain an image classification result; and
dividing the acquired training images into two parts; wherein the maximizing the mutual information between the training images and the neural network architecture, and automatically determining the neural network architecture and parameters of the neural network, comprises:
constructing a super-network and an architecture-generating network, respectively performing data processing thereon to obtain network parameters of the super-network and parameters of the architecture-generating network, and constructing a target network; and
inputting all training images into the target network to generate a predicted image category label, calculating a cross entropy loss of the image classification according to the predicted image category label and a real image category label, and training the target network until convergence for image classification;
wherein the constructing a super-network and an architecture-generating network, respectively performing data processing thereon to obtain the network parameters of the super-network and the parameters of the architecture-generating network, and constructing a target network comprises:
S1: constructing cells based on all possible image classification operations, and
constructing a super-network with the cells, wherein
the super-network is formed by stacking the cells containing all possible image classification operations;
S2: constructing an architecture-generating network based on a convolutional neural network, sampling from a standard Gaussian distribution to obtain a sampling value as an input of the architecture-generating network, and obtaining an output of the architecture-generating network through forward propagation;
sampling noise from the standard Gaussian distribution; and
summing the output of the architecture-generating network and the sampled noise to obtain an architecture parameter of the super-network;
S3: inputting a first part of the training images into the super-network to generate a predicted category label;
calculating an image classification cross entropy loss according to the predicted category label and the real category label; and
updating the network parameters of the super-network according to the image classification cross entropy loss with a gradient descent method;
S4: inputting a second part of the training images into the super-network, maximizing the mutual information between the image data and the architecture parameter of the super-network, and determining a lower bound of the mutual information, wherein
the lower bound of the mutual information is a cross entropy loss between the posterior distribution of the architecture parameter and the posterior distribution of the image data; the cross entropy loss is calculated, and the parameters of the architecture-generating network are updated with the gradient descent method; and
repeating S2-S4 to iteratively update the network parameters of the super-network and the parameters of the architecture-generating network until convergence, and stacking the updated cells to construct the target network.
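The sampling of step S2 can be sketched as follows. This is a toy illustration only: the claim's convolutional architecture-generating network is replaced by a hypothetical linear map `W`, and the cell sizes (`NUM_EDGES`, `NUM_OPS`) are assumed for the example; the claimed relation is simply that the architecture parameter is the generator output plus Gaussian noise, α = g(z) + ε.

```python
import math
import random

random.seed(0)

NUM_EDGES, NUM_OPS = 4, 3  # hypothetical cell: 4 edges, 3 candidate operations

# Hypothetical generator weights, standing in for the convolutional
# architecture-generating network of step S2.
W = [[random.gauss(0.0, 0.1) for _ in range(NUM_OPS)] for _ in range(NUM_EDGES)]

def sample_architecture_params():
    """S2: sample z ~ N(0, 1), forward-propagate through the generator,
    then add Gaussian noise eps to obtain the super-network's
    architecture parameters alpha = g(z) + eps."""
    z = [random.gauss(0.0, 1.0) for _ in range(NUM_EDGES)]
    g_out = [[z[e] * W[e][o] for o in range(NUM_OPS)] for e in range(NUM_EDGES)]
    eps = [[random.gauss(0.0, 1.0) for _ in range(NUM_OPS)] for e in range(NUM_EDGES)]
    return [[g_out[e][o] + eps[e][o] for o in range(NUM_OPS)] for e in range(NUM_EDGES)]

def softmax(row):
    # Numerically stable softmax over one edge's candidate operations.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [v / s for v in exps]

alpha = sample_architecture_params()
# Mixing weights over the candidate operations on each edge of the cell.
mix = [softmax(row) for row in alpha]
```

In a real super-network these mixing weights would weight the outputs of the candidate operations on each edge; here they only demonstrate the shape of the sampled parameter.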
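The alternating structure of steps S2-S4 can be sketched at a scalar level. Everything here is a stand-in: `w` plays the role of the super-network's weights, `theta` the generator's parameters, and the two quadratic losses are hypothetical surrogates for the cross entropy loss (S3) and the mutual-information lower bound (S4); only the control flow (sample architecture, update weights on split one, update generator on split two, repeat) mirrors the claim.

```python
import random

random.seed(1)

def sketch_training_loop(num_iters=50, lr=0.1):
    """Toy alternating optimization of S2-S4.

    Each iteration: S2 samples an architecture parameter alpha from the
    generator output plus Gaussian noise; S3 takes a gradient step on the
    super-network 'weight' w using a surrogate loss on data split one;
    S4 takes a gradient step on the generator 'parameter' theta using a
    surrogate mutual-information lower bound on data split two.
    """
    w, theta = 1.0, 1.0
    for _ in range(num_iters):
        # S2: alpha = g(z) + eps, with g a scalar multiplication here.
        alpha = theta * random.gauss(0.0, 1.0) + random.gauss(0.0, 1.0)
        # S3: gradient descent on a toy loss (w - alpha)^2 for the weights.
        w -= lr * 2.0 * (w - alpha)
        # S4: gradient descent on a toy surrogate (theta - w)^2 for the generator.
        theta -= lr * 2.0 * (theta - w)
    return w, theta

w, theta = sketch_training_loop()
```

After the loop converges, the claim stacks the updated cells into the target network and trains it to convergence on all training images with the cross entropy loss.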