US 11,915,138 B2
Method and device for reducing a size of a neural network model
Weifeng Zhang, San Mateo, CA (US); Guoyang Chen, San Mateo, CA (US); Yu Pu, San Mateo, CA (US); Yongzhi Zhang, San Mateo, CA (US); and Yuan Xie, San Mateo, CA (US)
Assigned to Alibaba Group Holding Limited, Grand Cayman (KY)
Filed by ALIBABA GROUP HOLDING LIMITED, Grand Cayman (KY)
Filed on Feb. 18, 2020, as Appl. No. 16/793,993.
Prior Publication US 2021/0256380 A1, Aug. 19, 2021
Int. Cl. G06N 3/082 (2023.01); G06N 3/10 (2006.01)
CPC G06N 3/082 (2013.01) [G06N 3/10 (2013.01)]
14 Claims
 
1. A method for reducing a size of a neural network model, comprising:
compressing data of the neural network model;
identifying structure information of a vector register, wherein the structure information includes a number of registers included in the vector register;
comparing a number of elements in the compressed data of the neural network model with a first condition, wherein the first condition is determined based on the number of registers in the vector register;
in response to the number of elements satisfying the first condition, associating the compressed data of the neural network model with the vector register to enable loading the compressed data of the neural network model to the vector register;
in response to the number of elements satisfying the first condition, sending an indication to end compression of the data;
comparing the number of elements in the compressed data of the neural network model with a second condition, wherein the second condition is determined based on the number of registers in the vector register, and is different from the first condition; and
adjusting a structure of the vector register in response to the number of elements satisfying the second condition,
wherein satisfying the first condition comprises being equal to or smaller than the number of registers in the vector register, and wherein satisfying the second condition comprises being equal to or smaller than one half of the number of registers in the vector register.
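
The two conditions recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the names `Decision`, `check_conditions`, `prune_smallest`, and `compress_until_fits` are hypothetical, and magnitude-based pruning is assumed only as a stand-in for the unspecified compression step. The claim does not say how the register structure is adjusted, so the sketch only flags that the second condition holds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Decision:
    load_to_register: bool  # first condition met: data may be associated with the vector register
    end_compression: bool   # first condition met: indication to end compression
    adjust_register: bool   # second condition met: register structure may be adjusted


def check_conditions(num_elements: int, num_registers: int) -> Decision:
    """Compare the element count against the two thresholds of claim 1."""
    # First condition: equal to or smaller than the number of registers.
    first = num_elements <= num_registers
    # Second condition: equal to or smaller than one half of the number of
    # registers (written with integers to avoid floating-point division).
    second = 2 * num_elements <= num_registers
    return Decision(load_to_register=first, end_compression=first,
                    adjust_register=second)


def prune_smallest(data: List[float]) -> List[float]:
    """Assumed compression step: drop the smallest-magnitude element."""
    idx = min(range(len(data)), key=lambda i: abs(data[i]))
    return data[:idx] + data[idx + 1:]


def compress_until_fits(weights: List[float],
                        num_registers: int) -> Tuple[List[float], Decision]:
    """Compress until the first condition signals that compression may end."""
    # Assumption: zero-valued weights carry no information and are dropped first.
    data = [w for w in weights if w != 0.0]
    decision = check_conditions(len(data), num_registers)
    while not decision.end_compression and data:
        data = prune_smallest(data)
        decision = check_conditions(len(data), num_registers)
    return data, decision
```

For example, `compress_until_fits([0.5, 0.0, -0.1, 2.0, 0.3, -1.5], 4)` stops once four elements remain, which satisfies the first condition but not the second, while `check_conditions(2, 4)` satisfies both.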