CPC G06N 3/084 (2013.01) [G06F 17/18 (2013.01); G06N 20/00 (2019.01)] | 19 Claims |
1. A method for determining quantization parameters in a neural network, comprising:
obtaining an analyzing result of each type of data to be quantized, wherein the data to be quantized includes at least one type of data among neurons, weights, gradients, and biases of the neural network;
determining a corresponding quantization parameter according to the analyzing result of each type of the data to be quantized and a data bit width corresponding to the data to be quantized, wherein the quantization parameter is used by an artificial intelligence processor to perform corresponding quantization on data involved in a process of neural network operation; wherein the quantization parameter includes a point position parameter and a scaling coefficient; and
adjusting the data bit width according to a corresponding quantization error after the data to be quantized has been quantized using the quantization parameters, wherein:
the quantization error is determined according to quantized data and corresponding pre-quantized data,
the quantization error is compared with a threshold to obtain a comparison result, and
the data bit width is adjusted according to the comparison result,
wherein the threshold includes at least one from the group of a first threshold and a second threshold.
|
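The steps of claim 1 can be illustrated with a minimal sketch. The claim does not fix any formulas, so everything concrete below is an assumption made for illustration: the "analyzing result" is taken to be the maximum absolute value of the data, the point position and scaling coefficient are derived from it with one common fixed-point convention, the quantization error is a relative mean absolute error, and the first and second thresholds are treated as upper and lower bounds that trigger widening or narrowing of the bit width. The function names and the `step` parameter are hypothetical.

```python
import math


def quantization_params(data, bit_width):
    """Derive (point_position, scaling_coefficient) from the data.

    Assumption: the analyzing result is the maximum absolute value.
    """
    z = max(abs(x) for x in data)
    q_max = 2 ** (bit_width - 1) - 1  # largest signed integer code
    if z == 0:
        return 0, 1.0
    s = math.ceil(math.log2(z / q_max))  # point position (power-of-two exponent)
    f = z / (2 ** s * q_max)             # scaling coefficient, in (0.5, 1]
    return s, f


def quantize(data, bit_width, s, f):
    """Quantize to signed integer codes, then map back to real values."""
    q_max = 2 ** (bit_width - 1) - 1
    step = 2 ** s * f
    out = []
    for x in data:
        code = max(-q_max, min(q_max, round(x / step)))
        out.append(code * step)
    return out


def quantization_error(original, dequantized):
    """Assumed error metric: relative mean absolute error."""
    denom = sum(abs(x) for x in original)
    if denom == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(original, dequantized)) / denom


def adjust_bit_width(bit_width, error, first_threshold, second_threshold, step=2):
    """Compare the error with the thresholds and adjust the bit width.

    Assumption: error above first_threshold widens the data,
    error below second_threshold narrows it, otherwise it is kept.
    """
    if error > first_threshold:
        return bit_width + step
    if error < second_threshold:
        return max(2, bit_width - step)
    return bit_width


# Usage: one pass of quantize, measure, adjust on a small weight vector.
weights = [0.5, -1.0, 0.25, 0.75]
n = 2
s, f = quantization_params(weights, n)
err = quantization_error(weights, quantize(weights, n, s, f))
n = adjust_bit_width(n, err, first_threshold=0.05, second_threshold=0.001)
```

In this sketch the same routine would be run separately for each type of data (neurons, weights, gradients, biases), since the claim calls for a quantization parameter per type.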