US 12,073,317 B2
Method and system for processing a neural network
Ao Ren, San Mateo, CA (US); Tao Zhang, San Mateo, CA (US); Yuhao Wang, San Mateo, CA (US); and Yuan Xie, San Mateo, CA (US)
Assigned to Alibaba Group Holding Limited, Grand Cayman (KY)
Filed by ALIBABA GROUP HOLDING LIMITED, Grand Cayman (KY)
Filed on Jan. 7, 2020, as Appl. No. 16/736,412.
Prior Publication US 2021/0209462 A1, Jul. 8, 2021
Int. Cl. G06N 3/08 (2023.01); G06F 17/16 (2006.01); G06N 3/02 (2006.01); G06V 10/82 (2022.01)
CPC G06N 3/08 (2013.01) [G06F 17/16 (2013.01); G06N 3/02 (2013.01); G06V 10/82 (2022.01); G06T 2207/20 (2013.01)]
18 Claims
 
1. A computer-implemented method for processing a neural network associated with an input matrix having a first number of elements, comprising:
dividing the input matrix into a plurality of vectors, each vector having a second number of elements;
grouping the plurality of vectors into a first group of vectors and a second group of vectors;
assigning vectors in the first group to a plurality of buckets according to a position of a key element in each vector of the first group;
pruning the first group of vectors and the second group of vectors; and
executing the neural network using the first group of pruned vectors and the second group of pruned vectors, wherein the neural network is executed using the first group of pruned vectors in parallel threads corresponding to the plurality of buckets,
wherein grouping the plurality of vectors into the first group of vectors and the second group of vectors further comprises:
determining a pruning ratio for the input matrix;
determining parameters of the first group of vectors and the second group of vectors based on the first number, the second number, and the pruning ratio; and
grouping the plurality of vectors into the first group of vectors and the second group of vectors based on the determined parameters,
the parameters comprising at least one of a bucket size for the plurality of buckets in the first group, a number of empty vectors in the first group, a size of the second group, or a number of non-zero elements to be retained in the second group.
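The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the "key element" is taken to be the largest-magnitude element of a vector, the first group is simply the leading vectors, first-group pruning keeps only the key element, second-group pruning keeps a fixed number of largest-magnitude elements, and "executing" is reduced to multiplying each retained weight by a shared activation vector x. The function names and the parameters first_count and keep are hypothetical; the claim derives such parameters from the element counts and the pruning ratio, whereas the sketch takes them as given.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def divide(matrix, vec_len):
    """Split each row into consecutive vectors of vec_len elements."""
    vectors = []
    for row in matrix:
        if len(row) % vec_len:
            raise ValueError("row length must be a multiple of vec_len")
        vectors.extend(row[i:i + vec_len] for i in range(0, len(row), vec_len))
    return vectors


def key_position(vec):
    """Assumption: the key element is the largest-magnitude element."""
    return max(range(len(vec)), key=lambda i: abs(vec[i]))


def bucketize(first_group):
    """Map each key-element position to the indices of vectors sharing it."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(first_group):
        buckets[key_position(vec)].append(idx)
    return dict(buckets)


def prune_first(first_group):
    """Assumption: keep only the key element of each first-group vector."""
    pruned = []
    for vec in first_group:
        k = key_position(vec)
        pruned.append([v if i == k else 0 for i, v in enumerate(vec)])
    return pruned


def prune_second(second_group, keep):
    """Assumption: retain the `keep` largest-magnitude elements of the group."""
    flat = [(abs(v), r, c)
            for r, vec in enumerate(second_group)
            for c, v in enumerate(vec)]
    kept = {(r, c) for _, r, c in sorted(flat, reverse=True)[:keep]}
    return [[v if (r, c) in kept else 0 for c, v in enumerate(vec)]
            for r, vec in enumerate(second_group)]


def run_buckets(pruned_first, buckets, x):
    """One worker per bucket; all vectors in a bucket read the same x[pos]."""
    out = [0] * len(pruned_first)

    def work(item):
        pos, indices = item
        for idx in indices:
            out[idx] = pruned_first[idx][pos] * x[pos]

    with ThreadPoolExecutor(max_workers=max(1, len(buckets))) as pool:
        list(pool.map(work, buckets.items()))
    return out


# Illustrative data: a 2 x 8 matrix (first number = 16), vectors of 4 elements.
matrix = [[1, -5, 2, 0, 3, 0, 0, 7],
          [0, 0, 1, 4, 0, 2, -6, 1]]
first_count, keep = 3, 2          # hypothetical, assumed already determined

vectors = divide(matrix, 4)
first_group, second_group = vectors[:first_count], vectors[first_count:]
buckets = bucketize(first_group)
pruned_first = prune_first(first_group)
pruned_second = prune_second(second_group, keep)
outputs = run_buckets(pruned_first, buckets, [1, 2, 3, 4])
print(buckets, pruned_first, pruned_second, outputs)
```

Grouping vectors by key-element position means every worker touches a single position of x, which is one plausible reason for tying threads to buckets; the sketch does not attempt to balance bucket sizes or insert empty vectors, both of which the claim lists among the possible parameters.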