US 12,079,728 B2
Device, method, and program for quantitatively analyzing structure of a neural network
Chihiro Watanabe, Tokyo (JP); Kaoru Hiramatsu, Tokyo (JP); and Kunio Kashino, Tokyo (JP)
Assigned to NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
Appl. No. 16/980,380
Filed by NIPPON TELEGRAPH AND TELEPHONE CORPORATION, Tokyo (JP)
PCT Filed Mar. 7, 2019, PCT No. PCT/JP2019/009114
§ 371(c)(1), (2) Date Sep. 12, 2020,
PCT Pub. No. WO2019/176731, PCT Pub. Date Sep. 19, 2019.
Claims priority of application No. 2018-047140 (JP), filed on Mar. 14, 2018.
Prior Publication US 2021/0248422 A1, Aug. 12, 2021
Int. Cl. G06F 18/21 (2023.01); G06F 18/23 (2023.01); G06N 3/084 (2023.01); G06V 30/182 (2022.01); G06V 30/19 (2022.01); G06V 30/10 (2022.01)
CPC G06N 3/084 (2013.01) [G06F 18/217 (2023.01); G06F 18/23 (2023.01); G06V 30/1823 (2022.01); G06V 30/19173 (2022.01); G06V 30/10 (2022.01)]
14 Claims
 
1. A computer-implemented method for improving generalization performance of a neural network, the method comprising:
receiving a neural network, wherein the neural network is a trained neural network for generating output data from input data based on a set of learning data including the input data and the output data, wherein the input data is in multi-dimensional vector form, and the neural network includes a plurality of layers;
receiving a structure of a cluster of the neural network, wherein the structure of the cluster indicates a plurality of predetermined units in a layer of the plurality of layers of the neural network, the cluster represents a set of extracted units in the layer of the plurality of layers in the trained neural network based on a connection relationship between vertices of adjacent layers of the layer according to a plurality of edges with connection weights, the plurality of edges connect respective units in the cluster with other units of the adjacent layers of the layer in the plurality of layers in the trained neural network, and the structure of the cluster is in multi-dimensional vector form;
determining a first relationship between one or more dimensions of the input data of the neural network and a dimension of the cluster, wherein the first relationship is based on a sum of squared errors between the first set of data and a second set of data,
the first set of data represents output values of the respective units of the set of extracted units in the cluster using a data value of one of a plurality of dimensions of the input data in the set of learning data as input to the trained neural network, and
the second set of data represents output values of the respective units of the set of extracted units in the cluster using an averaged value of data values of the plurality of dimensions of the input data in the set of learning data as input to the trained neural network;
determining a second relationship between the dimension of the cluster and a dimension of the output data of the neural network, wherein the second relationship is based on a squared error between a first data value of the dimension of the output data and a second value of the dimension of the output data,
the first data value of the dimension of the output data represents output from the predetermined units in the cluster, and
the second data value of the dimension of the output data represents an average data value of output from the predetermined units as the output from each of the predetermined units in the cluster; and
retraining the trained neural network and improving the generalization performance of the neural network by using one or more noise data as training data, the one or more noise data correspond to at least one of the determined first relationship and the determined second relationship, and the retraining further comprises adding the one or more noise data to output from respective units of the extracted units in the cluster of the trained neural network via backpropagation.
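
As an illustration only, the two squared-error relationships recited in claim 1 can be sketched in NumPy under one possible reading of the claim language. The sketch assumes a single hidden layer with sigmoid units, that "averaged value" means the mean of one input dimension over the learning data, and that "average data value of output" means each cluster unit's mean output over the learning data. The function names (`input_to_cluster`, `cluster_to_output`), the weight arrays, and the network shape are hypothetical and do not come from the patent. The retraining step is not sketched.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def hidden(X, W1, b1):
    # Outputs of all hidden units for every sample (rows of X).
    return sigmoid(X @ W1 + b1)


def output(H, W2, b2):
    # Network outputs computed from hidden-unit outputs H.
    return sigmoid(H @ W2 + b2)


def input_to_cluster(X, W1, b1, cluster):
    """Assumed reading of the first relationship.

    For each input dimension i, returns the sum of squared errors between
    the cluster units' outputs on the original inputs and their outputs
    when dimension i is replaced by its mean over the learning data.
    """
    base = hidden(X, W1, b1)[:, cluster]
    mean = X.mean(axis=0)
    scores = np.empty(X.shape[1])
    for i in range(X.shape[1]):
        X_avg = X.copy()
        X_avg[:, i] = mean[i]  # replace one dimension by its average
        scores[i] = np.sum((base - hidden(X_avg, W1, b1)[:, cluster]) ** 2)
    return scores


def cluster_to_output(X, W1, b1, W2, b2, cluster):
    """Assumed reading of the second relationship.

    For each output dimension j, returns the squared error (summed over
    samples) between the normal network output and the output obtained
    when every unit in the cluster emits its average output instead.
    """
    H = hidden(X, W1, b1)
    base = output(H, W2, b2)
    H_avg = H.copy()
    H_avg[:, cluster] = H[:, cluster].mean(axis=0)  # per-unit average
    return np.sum((base - output(H_avg, W2, b2)) ** 2, axis=0)


# Toy usage with random, hypothetical weights: 3 inputs, 4 hidden units,
# 2 outputs; the cluster is hidden units 0 and 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
W1, b1 = rng.normal(size=(3, 4)), rng.normal(size=4)
W2, b2 = rng.normal(size=(4, 2)), rng.normal(size=2)
cluster = [0, 1]

first_relationship = input_to_cluster(X, W1, b1, cluster)             # shape (3,)
second_relationship = cluster_to_output(X, W1, b1, W2, b2, cluster)   # shape (2,)
```

Under this reading, a larger score means that the input dimension (or the cluster) has more influence on the cluster's outputs (or on the output dimension). An input dimension with no weighted edge into the cluster scores zero.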