US 11,669,732 B2
Neural network quantization method, device and related products
Yubin Shen, Beijing (CN); Zhibin Guo, Beijing (CN); Xinkai Song, Beijing (CN); and Shaoli Liu, Beijing (CN)
Assigned to CAMBRICON TECHNOLOGIES CORPORATION LIMITED, Beijing (CN)
Filed by Cambricon Technologies Corporation Limited, Beijing (CN)
Filed on Dec. 11, 2019, as Appl. No. 16/711,376.
Claims priority of application No. 201811654179.7 (CN), filed on Dec. 29, 2018.
Prior Publication US 2020/0210830 A1, Jul. 2, 2020
Int. Cl. G06N 3/08 (2023.01); G06N 3/04 (2023.01)

CPC G06N 3/08 (2013.01) [G06N 3/04 (2013.01)]

14 Claims

1. A neural network quantization method, comprising:

obtaining a weight and input data of a target quantization layer of an original neural network, wherein the target quantization layer includes at least one computation layer of the original neural network;

determining a quantization parameter of a weight of a corresponding layer by using the weight of the target quantization layer of the original neural network;

determining a quantization parameter of input data of a corresponding layer by using the input data of the target quantization layer of the original neural network, wherein both the weight and the input data of the target quantization layer follow a principle of not distorting a maximum absolute value; and

quantizing the target quantization layer of the original neural network according to the quantization parameter of the weight and the quantization parameter of the input data to generate a quantized weight and quantized input data.
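
The steps of claim 1 can be illustrated with a minimal sketch. The claim does not state a formula, so everything specific here is an assumption made for illustration: the symmetric scaling rule (the maximum absolute value maps exactly onto the largest integer code, one possible reading of "not distorting a maximum absolute value"), the 8-bit width, and the hypothetical helper names quantization_scale, quantize, and quantize_layer. It is not presented as the patented method.

```python
from typing import List, Tuple


def quantization_scale(values: List[float], bits: int = 8) -> float:
    """Hypothetical helper: derive a quantization parameter (a scale)
    from the data so that max |value| maps onto the largest code."""
    max_abs = max(abs(v) for v in values)
    q_max = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits (assumed width)
    return max_abs / q_max if max_abs > 0 else 1.0


def quantize(values: List[float], scale: float, bits: int = 8) -> List[int]:
    """Round each value to the nearest code and clamp to the signed range."""
    q_max = 2 ** (bits - 1) - 1
    return [max(-q_max, min(q_max, round(v / scale))) for v in values]


def quantize_layer(
    weights: List[float], inputs: List[float], bits: int = 8
) -> Tuple[List[int], float, List[int], float]:
    """Assumed flow: one parameter from the layer's weights, one from its
    input data, then both tensors are quantized with their own parameter."""
    w_scale = quantization_scale(weights, bits)
    x_scale = quantization_scale(inputs, bits)
    return quantize(weights, w_scale, bits), w_scale, quantize(inputs, x_scale, bits), x_scale


# Usage: the weight and the input data each receive their own parameter.
q_w, w_scale, q_x, x_scale = quantize_layer([0.5, -1.27, 0.02], [1.0, -4.0, 0.0])
```

Under these assumptions the element with the largest magnitude in each tensor lands on the extreme code, so it is represented without clipping; other scaling rules (for example power-of-two scales) would satisfy the same constraint differently.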