CPC G06N 3/082 (2013.01) [G06F 9/5027 (2013.01); G06F 15/80 (2013.01); G06N 3/063 (2013.01)]
7 Claims

1. A method of accelerating training of a low-power, high-performance artificial neural network (ANN), the method comprising:
(a) performing fine-grained pruning and coarse-grained pruning to generate sparsity in weights by a pruning unit in a convolution core of a cluster in a low-power, high-performance ANN trainer, wherein
the fine-grained pruning generates a random sparsity pattern by replacing small-magnitude values with zeros, and
the coarse-grained pruning calculates similarities between weights, or magnitudes of the weights, on an output-channel basis and replaces consecutive similar weights, or consecutive small-magnitude weights, with consecutive zeros;
(b) selecting and performing dual zero skipping according to input sparsity, output sparsity, and the sparsity of weights by the convolution core,
wherein, when the convolution core performs the dual zero skipping by using the sparsity of weights, the convolution core skips zeros in weight data by
skipping computations using the consecutive zeros caused by the coarse-grained pruning at once, and
skipping computations using random zeros caused by the fine-grained pruning one at a time; and
(c) restricting access to a weight memory during training by allowing a deep neural network (DNN) computation core and a weight pruning core to share weights retrieved from the weight memory by the convolution core.
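The sketch below is a minimal NumPy model of the two pruning modes in step (a) of claim 1. The weight layout (output channels first), the magnitude thresholds, the block size, and the mean-magnitude block criterion are illustrative assumptions, not values taken from the claim.

```python
import numpy as np

def fine_grained_prune(w, threshold=0.1):
    """Fine-grained pruning: zero individual small-magnitude weights,
    yielding a random (unstructured) sparsity pattern."""
    w = w.copy()
    w[np.abs(w) < threshold] = 0.0
    return w

def coarse_grained_prune(w, block=4, threshold=0.5):
    """Coarse-grained pruning: per output channel, replace a run of `block`
    consecutive weights whose mean magnitude is small with consecutive zeros.
    The block size and mean-magnitude criterion are illustrative assumptions."""
    w = w.copy()
    flat = w.reshape(w.shape[0], -1)          # one row per output channel
    for oc in range(flat.shape[0]):
        for b in range(0, flat.shape[1] - block + 1, block):
            seg = flat[oc, b:b + block]
            if np.mean(np.abs(seg)) < threshold:
                seg[:] = 0.0                  # a run of consecutive zeros
    return flat.reshape(w.shape)

w = np.random.randn(8, 8, 3, 3).astype(np.float32)
w_pruned = coarse_grained_prune(fine_grained_prune(w))
print("weight sparsity:", float(np.mean(w_pruned == 0.0)))
```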
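The following is a behavioral model of the weight-side dual zero skipping in step (b): blocks of consecutive zeros produced by coarse-grained pruning are bypassed in a single step, while isolated zeros from fine-grained pruning are bypassed one element at a time. The block size, scan order, and function name are assumptions; real hardware would skip pipeline cycles rather than loop iterations.

```python
import numpy as np

def dual_zero_skip_dot(weights, acts, block=4):
    """Behavioral model of weight-side dual zero skipping for one dot product.
    `block` and the traversal order are assumptions for illustration."""
    acc = 0.0
    i, n = 0, len(weights)
    while i < n:
        # Coarse skip: a block of consecutive zeros (coarse-grained pruning)
        # is bypassed at once.
        if i + block <= n and not np.any(weights[i:i + block]):
            i += block
            continue
        # Fine skip: an isolated zero (fine-grained pruning) costs no multiply.
        if weights[i] != 0.0:
            acc += weights[i] * acts[i]
        i += 1
    return acc

w = np.array([0, 0, 0, 0, 0.7, 0, 1.2, 0], dtype=np.float32)
a = np.ones_like(w)
print(dual_zero_skip_dot(w, a))   # 1.9: one block skip, two single-zero skips
```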
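Finally, a minimal sketch of the shared weight fetch in step (c), using hypothetical `SharedWeightFetch` and `training_step` names: the convolution core reads a weight tile from the weight memory once, and both the DNN computation core and the weight pruning core consume that same copy, so the weight memory is accessed only once per tile.

```python
import numpy as np

class SharedWeightFetch:
    """Hypothetical model of step (c): a weight tile is read from the weight
    memory once and shared by both consumers, restricting memory accesses."""
    def __init__(self, weight_memory):
        self.weight_memory = weight_memory
        self.fetch_count = 0

    def fetch_tile(self, idx):
        self.fetch_count += 1                 # one weight-memory access
        return self.weight_memory[idx]

def training_step(fetcher, idx, acts, threshold=0.1):
    tile = fetcher.fetch_tile(idx)            # single shared fetch
    out = tile @ acts                         # DNN computation core uses the tile
    keep_mask = np.abs(tile) >= threshold     # pruning core reuses the same tile
    return out, keep_mask

mem = [np.random.randn(4, 4).astype(np.float32) for _ in range(2)]
fetcher = SharedWeightFetch(mem)
out, mask = training_step(fetcher, 0, np.ones(4, dtype=np.float32))
print("weight-memory accesses:", fetcher.fetch_count)   # 1, not 2
```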