US 12,141,699 B2
Systems and methods for providing vector-wise sparsity in a neural network
Maohua Zhu, Sunnyvale, CA (US); Tao Zhang, Sunnyvale, CA (US); Zhenyu Gu, Sunnyvale, CA (US); and Yuan Xie, Sunnyvale, CA (US)
Assigned to Alibaba Group Holding Limited, Grand Cayman (KY)
Filed by ALIBABA GROUP HOLDING LIMITED, Grand Cayman (KY)
Filed on Jul. 23, 2020, as Appl. No. 16/937,202.
Claims priority of provisional application 62/893,768, filed on Aug. 29, 2019.
Prior Publication US 2021/0065005 A1, Mar. 4, 2021
Int. Cl. G06N 3/082 (2023.01); G06F 17/16 (2006.01); G06N 3/048 (2023.01); G06N 3/084 (2023.01)
CPC G06N 3/082 (2013.01) [G06F 17/16 (2013.01); G06N 3/048 (2023.01); G06N 3/084 (2013.01)] 15 Claims
OG exemplary drawing
 
1. A method for providing vector-wise sparsity in a neural network, comprising:
training the neural network by applying a training data set to the neural network, wherein the neural network comprises an input layer and is configured to perform image recognition, facial recognition, translation, or 3D modeling using input received at the input layer;
dividing a matrix associated with the trained neural network into a plurality of vectors;
selecting a first subset of non-zero elements from the plurality of vectors to form a pruned matrix by selecting a predetermined number of non-zero elements in each vector;
re-training the trained neural network using the pruned matrix to reduce one or more associated loss functions by modifying one or more elements of the pruned matrix or modifying one or more activation functions of one or more nodes of the neural network; and
in response to a preset sparsity level being reached, outputting the pruned matrix for executing the re-trained neural network using the pruned matrix.
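The dividing and selecting steps of claim 1 describe vector-wise pruning: a weight matrix is split into fixed-length vectors and only a predetermined number of elements is retained in each vector. The sketch below is a minimal illustration of that idea, not the patented implementation; the vector length, the kept-element count k, the magnitude-based selection rule, and the helper name prune_vector_wise are all assumptions chosen for clarity (the claim itself only requires keeping a predetermined number of non-zero elements per vector).

```python
import numpy as np

def prune_vector_wise(weight, vector_len=16, k=4):
    """Split each row of `weight` into vectors of length `vector_len` and keep
    only the `k` largest-magnitude elements in each vector, zeroing the rest.

    `vector_len`, `k`, and magnitude-based selection are illustrative choices,
    not values prescribed by the patent.
    """
    rows, cols = weight.shape
    assert cols % vector_len == 0, "columns must divide evenly into vectors"
    pruned = weight.copy()
    mask = np.zeros_like(weight, dtype=bool)

    for r in range(rows):
        for start in range(0, cols, vector_len):
            vec = weight[r, start:start + vector_len]
            # Indices of the k largest-magnitude elements within this vector.
            keep = np.argsort(np.abs(vec))[-k:]
            mask[r, start + keep] = True

    pruned[~mask] = 0.0
    return pruned, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((8, 64)).astype(np.float32)
    pruned, mask = prune_vector_wise(w, vector_len=16, k=4)
    # Each 16-element vector now holds exactly 4 non-zero weights (75% sparsity).
    print("sparsity:", 1.0 - mask.mean())
```

Because every vector retains the same number of non-zero elements, the resulting sparsity pattern is regular, which is what makes vector-wise sparsity amenable to efficient hardware execution.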
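The re-training and sparsity-check steps of claim 1 can likewise be sketched as a fine-tuning loop that keeps pruned positions at zero. The fragment below is a hedged illustration only: the choice of PyTorch, the function name retrain_with_mask, the SGD optimizer, the MSE loss, and the learning rate are assumptions, and the patent does not prescribe any particular framework, loss function, or schedule for reaching the preset sparsity level.

```python
import torch
import torch.nn as nn

def retrain_with_mask(layer, mask, data_loader, epochs=1, lr=1e-3):
    """Fine-tune a pruned linear layer while holding pruned weights at zero.

    `layer`, `mask`, `data_loader`, and the hyperparameters are illustrative;
    they are not taken from the patent.
    """
    mask = mask.to(layer.weight.device, dtype=layer.weight.dtype)
    optimizer = torch.optim.SGD(layer.parameters(), lr=lr)
    loss_fn = nn.MSELoss()

    for _ in range(epochs):
        for x, target in data_loader:
            optimizer.zero_grad()
            loss = loss_fn(layer(x), target)
            loss.backward()
            optimizer.step()
            # Re-apply the vector-wise sparsity mask so pruned positions stay zero.
            with torch.no_grad():
                layer.weight.mul_(mask)
    return layer
```

In an iterative scheme, pruning and re-training of this kind would be repeated, increasing the number of zeroed elements each round, until the preset sparsity level of claim 1 is reached and the pruned matrix is output for inference.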