US 12,093,148 B2
Neural network quantization parameter determination method and related products
Shaoli Liu, Shanghai (CN); Xiaofu Meng, Shanghai (CN); Xishan Zhang, Shanghai (CN); Jiaming Guo, Shanghai (CN); Di Huang, Shanghai (CN); Yao Zhang, Shanghai (CN); Yu Chen, Shanghai (CN); and Chang Liu, Shanghai (CN)
Assigned to Shanghai Cambricon Information Technology Co., Ltd, Shanghai (CN)
Filed by Shanghai Cambricon Information Technology Co., Ltd, Shanghai (CN)
Filed on Dec. 10, 2021, as Appl. No. 17/547,972.
Application 17/547,972 is a continuation of application No. PCT/CN2019/106801, filed on Sep. 19, 2019.
Claims priority of application No. 201910505239.7 (CN), filed on Jun. 12, 2019; application No. 201910515355.7 (CN), filed on Jun. 14, 2019; application No. 201910528537.8 (CN), filed on Jun. 18, 2019; and application No. 201910570125.0 (CN), filed on Jun. 27, 2019.
Prior Publication US 2022/0261634 A1, Aug. 18, 2022
Int. Cl. G06F 11/14 (2006.01); G06N 3/047 (2023.01); G06N 3/08 (2023.01)
CPC G06F 11/1476 (2013.01) [G06N 3/047 (2023.01); G06N 3/08 (2013.01); G06F 2201/81 (2013.01); G06F 2201/865 (2013.01)] 15 Claims
OG exemplary drawing
 
1. A method for adjusting a data bit width in a convolutional neural network layer during a neural network computation, comprising:
obtaining a data bit width used to perform a quantization on data to be quantized, wherein the data to be quantized includes at least one of neurons, weights, gradients, or biases, and the data bit width indicates a bit width of quantized data obtained after the data to be quantized is quantized;
performing a quantization on a group of data to be quantized based on the data bit width to convert the group of data to be quantized to a group of quantized data, wherein the group of quantized data has the data bit width;
comparing the group of data to be quantized with the group of quantized data to determine a quantization error correlated with the data bit width;
adjusting the data bit width based on the determined quantization error; and
applying the adjusted data bit width during quantization in the convolutional neural network layer.
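The claimed steps — quantize at a given bit width, measure the quantization error against the original data, then adjust the bit width — can be sketched as follows. The symmetric linear quantization formula, the relative-error metric, and the thresholds `high_thresh` and `low_thresh` are illustrative assumptions; the claim does not fix any particular quantization scheme or error criterion.

```python
import numpy as np

def quantize(data, bit_width):
    # Symmetric linear quantization (an illustrative scheme; the claim
    # does not specify a formula). Values are mapped to integers in
    # [-2^(n-1), 2^(n-1) - 1] and then dequantized for comparison.
    max_abs = np.max(np.abs(data))
    scale = max_abs / (2 ** (bit_width - 1) - 1) if max_abs > 0 else 1.0
    q = np.clip(np.round(data / scale),
                -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1)
    return q * scale  # dequantized values at the given bit width

def adjust_bit_width(data, bit_width, high_thresh=0.05, low_thresh=0.01):
    # Compare the group of data to be quantized with its quantized
    # counterpart to obtain a quantization error correlated with the
    # bit width, then widen or narrow the bit width accordingly.
    # The thresholds here are hypothetical tuning parameters.
    dequantized = quantize(data, bit_width)
    error = np.mean(np.abs(data - dequantized)) / (np.mean(np.abs(data)) + 1e-12)
    if error > high_thresh:
        return bit_width + 1   # error too large: widen the bit width
    if error < low_thresh:
        return bit_width - 1   # error comfortably small: narrow it
    return bit_width
```

In this sketch a wider bit width yields a finer scale and hence a smaller quantization error, which is what makes the error a usable feedback signal for the adjustment step.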