US 12,205,012 B2
Method of accelerating training process of neural network and neural network device thereof
Seungwon Lee, Hwaseong-si (KR); Hanmin Park, Seoul (KR); Gunhee Lee, Seoul (KR); Namhyung Kim, Seoul (KR); Joonsang Yu, Seoul (KR); and Kiyoung Choi, Seoul (KR)
Assigned to Samsung Electronics Co., Ltd., Suwon-si (KR); and Seoul National University R&DB Foundation, Seoul (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR); and Seoul National University R&DB Foundation, Seoul (KR)
Filed on Aug. 26, 2019, as Appl. No. 16/550,498.
Claims priority of provisional application 62/722,395, filed on Aug. 24, 2018.
Claims priority of application No. 10-2018-0163309 (KR), filed on Dec. 17, 2018.
Prior Publication US 2020/0065659 A1, Feb. 27, 2020
Int. Cl. G06N 3/063 (2023.01); G06N 3/04 (2023.01); G06N 3/084 (2023.01)

CPC G06N 3/063 (2013.01) [G06N 3/04 (2013.01); G06N 3/084 (2013.01)]

24 Claims

1. A method of accelerating a training of a neural network through backpropagation process, the method comprising:

in the backward propagation process:

acquiring a bit-vector including bits indicating which of output activations of forward propagation operation of a layer of the neural network are zero or non-zero, the bit-vector having been generated in a process of a forward propagation of the neural network, with input activations of the layer in the backward propagation process respectively corresponding to the output activations;

selecting, based on bits of the bit-vector indicating which of the input activations have values of 0, only non-zero activations and corresponding filters respectively from among the input activations and forward propagation filters of the layer used in the forward propagation operation;

generating backward propagation filters of the layer by rearranging the selected corresponding filters, where there are fewer generated backward propagation filters than forward propagation filters;

generating zero padded activations by performing zero padding on the selected non-zero activations; and

performing the backward propagation using the zero padded activations and the backward propagation filters,

wherein the selecting of the non-zero activations comprises selecting the non-zero activations based on an interpreting of the bits included in the bit-vector for a total number of the selected non-zero activations based on a total number of multipliers available for the performance of the backward propagation, or until the total number of the selected non-zero activations equals the total number of multipliers.
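
Illustrative note (not part of the claim): the bit-vector generation and the multiplier-bounded selection recited above can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions. The function names `make_bit_vector` and `select_nonzero`, the sample values, the string filter labels, and the ReLU-like layer (so that zero forward outputs line up with zero backward inputs) are all hypothetical and do not come from the patent. Rearranging the selected filters into backward propagation filters and zero padding the selected activations are not modeled here.

```python
def make_bit_vector(output_activations):
    """Return one bit per output activation: 1 if non-zero, 0 if zero."""
    return [1 if a != 0 else 0 for a in output_activations]


def select_nonzero(bit_vector, activations, filters, num_multipliers):
    """Scan the bits in order, keeping non-zero activations and their filters.

    Scanning stops once the number of selected activations equals the number
    of available multipliers, or when the bit-vector is exhausted.
    Returns the selected activations, the selected filters, and the position
    at which a later scan would resume.
    """
    selected_acts, selected_filters = [], []
    pos = 0
    while pos < len(bit_vector) and len(selected_acts) < num_multipliers:
        if bit_vector[pos]:
            selected_acts.append(activations[pos])
            selected_filters.append(filters[pos])
        pos += 1
    return selected_acts, selected_filters, pos


# Forward propagation (assumed ReLU-like layer): record which outputs are zero.
forward_outputs = [0.0, 1.5, 0.0, 0.0, 2.0, 0.7, 0.0, 3.1]
bits = make_bit_vector(forward_outputs)

# Backward propagation: the layer's input activations correspond one-to-one
# to the forward outputs, so the same bits mark which of them are zero.
backward_inputs = [0.0, -0.2, 0.0, 0.0, 0.4, 0.1, 0.0, -0.9]
filter_ids = ["f0", "f1", "f2", "f3", "f4", "f5", "f6", "f7"]  # placeholder labels

# Hypothetical hardware budget of three multipliers.
acts, filts, next_pos = select_nonzero(bits, backward_inputs, filter_ids, 3)
print(bits, acts, filts, next_pos)
```

In this sketch the scan reads the bits rather than the activation values themselves, so zero activations and their filters are never fetched, and each pass yields at most as many activations as there are multipliers.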