US 11,675,676 B2
Neural network quantization parameter determination method and related products
Shaoli Liu, Shanghai (CN); Xiaofu Meng, Shanghai (CN); Xishan Zhang, Shanghai (CN); and Jiaming Guo, Shanghai (CN)
Assigned to SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD, Shanghai (CN)
Appl. No. 16/622,541
Filed by SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD, Shanghai (CN)
PCT Filed Sep. 19, 2019, PCT No. PCT/CN2019/106754
§ 371(c)(1), (2) Date Dec. 13, 2019.
PCT Pub. No. WO2020/248423, PCT Pub. Date Dec. 17, 2020.
Claims priority of application No. 201910505239.7 (CN), filed on Jun. 12, 2019; application No. 201910515355.7 (CN), filed on Jun. 14, 2019; application No. 201910528537.8 (CN), filed on Jun. 18, 2019; and application No. 201910570125.0 (CN), filed on Jun. 27, 2019.
Prior Publication US 2021/0286688 A1, Sep. 16, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 11/14 (2006.01); G06N 3/08 (2023.01); G06N 3/047 (2023.01)
CPC G06F 11/1476 (2013.01) [G06N 3/047 (2023.01); G06N 3/08 (2013.01); G06F 2201/81 (2013.01); G06F 2201/865 (2013.01)]
20 Claims
 
1. A method for quantizing data in a neural network using neural network quantization parameters, comprising:
obtaining an analyzing result of each type of the data to be quantized, wherein the data includes at least one type of neurons, weights, gradients, or biases of the neural network;
determining a corresponding quantization parameter according to the analyzing result of each type of the data to be quantized and a data bit width corresponding to the data to be quantized;
quantizing the data using the corresponding quantization parameter to obtain quantized data;
performing inverse quantization on the quantized data to obtain inverse quantized data, wherein a data format of the inverse quantized data is the same as that of the corresponding pre-quantized data; and
determining a quantization error based on the quantized data and the inverse quantized data.
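
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and the claim does not fix any of the formulas used here. The following are assumptions made only for illustration: the "analyzing result" is taken to be the maximum absolute value of the data, the "quantization parameter" is taken to be a power-of-two exponent `s`, and the error metric is the summed absolute difference between the pre-quantized and inverse quantized values divided by the summed absolute pre-quantized values. All function names are hypothetical.

```python
import math
from typing import List, Tuple


def analyze(data: List[float]) -> float:
    """Analyzing result (assumed): maximum absolute value of the data."""
    return max(abs(x) for x in data)


def quantization_parameter(abs_max: float, bit_width: int) -> int:
    """Assumed parameter: smallest exponent s such that abs_max / 2**s
    fits in a signed integer of the given bit width."""
    q_max = 2 ** (bit_width - 1) - 1
    if abs_max == 0.0:
        return 0
    return math.ceil(math.log2(abs_max / q_max))


def quantize(data: List[float], s: int, bit_width: int) -> List[int]:
    """Scale by 2**s, round (Python's round-half-to-even), and clamp."""
    q_max = 2 ** (bit_width - 1) - 1
    scale = 2.0 ** s
    return [max(-q_max, min(q_max, round(x / scale))) for x in data]


def dequantize(quantized: List[int], s: int) -> List[float]:
    """Inverse quantization: return to the floating-point format of the input."""
    scale = 2.0 ** s
    return [q * scale for q in quantized]


def quantization_error(original: List[float], restored: List[float]) -> float:
    """Assumed metric: relative summed absolute difference."""
    num = sum(abs(a - b) for a, b in zip(original, restored))
    den = sum(abs(a) for a in original)
    return num / den if den else 0.0


def run_claim_steps(data: List[float], bit_width: int) -> Tuple[List[int], List[float], float]:
    """Chain the steps: analyze, derive parameter, quantize, dequantize, measure."""
    s = quantization_parameter(analyze(data), bit_width)
    quantized = quantize(data, s, bit_width)
    restored = dequantize(quantized, s)
    return quantized, restored, quantization_error(data, restored)


if __name__ == "__main__":
    # Example "weights" with an 8-bit data bit width.
    q, r, err = run_claim_steps([0.5, -1.25, 3.0, 0.0], bit_width=8)
    print(q, r, err)
```

In this sketch each type of data (neurons, weights, gradients, or biases) would be passed through `run_claim_steps` separately, so that each type gets its own parameter derived from its own analyzing result and data bit width.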