CPC G06N 3/084 (2013.01) [G06F 17/18 (2013.01); G06N 20/00 (2019.01)] | 22 Claims |
1. A method for determining quantization parameters in a neural network, comprising:
obtaining an analyzing result of each type of data to be quantized, wherein the data to be quantized includes at least one type of data among neurons, weights, gradients, or biases of the neural network; and
determining a corresponding quantization parameter according to the analyzing result of each type of the data to be quantized and a data bit width corresponding to the data to be quantized, wherein the quantization parameter is used by an artificial intelligence processor to perform corresponding quantization on data involved in a process of neural network operation, and wherein the quantization parameter is a first scaling coefficient; and
quantizing target data by using the corresponding quantization parameter, wherein a feature of the target data is similar to that of the data to be quantized,
wherein the analyzing result is a maximum value and a minimum value of, or a maximum absolute value of, each type of data to be quantized,
wherein the maximum absolute value is determined according to the maximum value and the minimum value of each type of data to be quantized,
wherein the quantization parameter is determined according to either the maximum value of each type of data to be quantized and the minimum value of each type of data to be quantized, or the maximum absolute value of each type of data, together with the data bit width.
|
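The steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not state a formula for the first scaling coefficient, so the sketch assumes a common symmetric scheme in which the coefficient equals the maximum absolute value divided by the largest signed integer representable in the data bit width. The function names `abs_max_from_range`, `first_scaling_coefficient`, and `quantize` are hypothetical and are not taken from the patent.

```python
from typing import Iterable, List


def abs_max_from_range(max_value: float, min_value: float) -> float:
    """Derive the maximum absolute value from the maximum and minimum values."""
    return max(abs(max_value), abs(min_value))


def first_scaling_coefficient(abs_max: float, bit_width: int) -> float:
    """Compute a scaling coefficient from the analyzing result and the bit width.

    ASSUMPTION: symmetric signed quantization, scale = abs_max / (2^(n-1) - 1).
    The claim only says the parameter depends on these two inputs.
    """
    if bit_width < 2:
        raise ValueError("bit_width must be at least 2 for a signed range")
    q_max = 2 ** (bit_width - 1) - 1  # e.g. 127 for 8 bits
    if abs_max == 0.0:
        return 1.0  # assumed fallback to avoid dividing by zero later
    return abs_max / q_max


def quantize(data: Iterable[float], scale: float, bit_width: int) -> List[int]:
    """Quantize target data with a previously determined scaling coefficient.

    Values are divided by the scale, rounded with Python's built-in round
    (ties go to the nearest even integer), and clamped to the signed range.
    """
    q_max = 2 ** (bit_width - 1) - 1
    return [max(-q_max, min(q_max, round(x / scale))) for x in data]


if __name__ == "__main__":
    # Analyzing result for one type of data (for example, weights).
    weights = [-0.8, 0.1, 0.45, 1.2]
    abs_max = abs_max_from_range(max(weights), min(weights))
    scale = first_scaling_coefficient(abs_max, bit_width=8)

    # Target data with a similar distribution reuses the same parameter.
    target = [-1.0, 0.3, 0.9]
    print(scale, quantize(target, scale, bit_width=8))
```

In this sketch the coefficient is computed once from the analyzed data and then reused for the target data, which mirrors the claim's requirement that the target data have a feature similar to that of the data to be quantized.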