CPC G06N 3/08 (2013.01) [B25J 9/161 (2013.01); B25J 9/163 (2013.01); G06N 3/04 (2013.01); G06V 40/00 (2022.01)] | 26 Claims |
1. A processor-implemented data processing method in a neural network, the data processing method comprising:
performing an inference operation by implementing a current convolutional layer, of the neural network, provided an input activation map to generate an output activation map, where a kernel weight of the current convolutional layer is a weight quantized to a first representation bit number from a trained kernel weight for the current convolutional layer; and
outputting another activation map that includes activations that are activation results, of the current convolutional layer, quantized to a second representation bit number within a range represented by an activation quantization parameter that includes a first parameter and a second parameter,
wherein the first parameter is dependent on first and second thresholds derived from the output activation map, and the second parameter is dependent on the first and second thresholds, and
wherein the quantization of the activation map is dependent on the first parameter and the second parameter.
|
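The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes a single-channel layer, uniform quantization, thresholds taken as the 1st and 99th percentiles of the output activation map, and a first and second parameter equal to the center and half-width of the interval between those thresholds. The claim fixes none of these choices, and every function name and default bit number below is hypothetical.

```python
import numpy as np


def quantize_uniform(x, lo, hi, bits):
    """Clip x to [lo, hi] and snap it to 2**bits evenly spaced levels."""
    if hi <= lo:  # degenerate range: every value collapses to one level
        return np.full_like(x, lo, dtype=float)
    levels = 2 ** bits - 1
    step = (hi - lo) / levels
    clipped = np.clip(x, lo, hi)
    return lo + np.round((clipped - lo) / step) * step


def conv2d_valid(x, kernel):
    """Naive single-channel 'valid' cross-correlation (no padding, stride 1)."""
    kh, kw = kernel.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out


def quantized_layer(x, trained_kernel, weight_bits=4, act_bits=3):
    # Kernel weight quantized to a first representation bit number.
    w_max = np.abs(trained_kernel).max()
    kernel_q = quantize_uniform(trained_kernel, -w_max, w_max, weight_bits)

    # Inference through the current convolutional layer.
    out = conv2d_valid(x, kernel_q)

    # ASSUMPTION: first and second thresholds are percentiles of the output map.
    t_lo, t_hi = np.percentile(out, [1, 99])

    # ASSUMPTION: both parameters depend on both thresholds as center and half-width.
    center = (t_lo + t_hi) / 2.0
    half_width = (t_hi - t_lo) / 2.0

    # Activations quantized to a second bit number within the parameterized range.
    act_q = quantize_uniform(out, center - half_width, center + half_width, act_bits)
    return act_q, (center, half_width)
```

Calling `quantized_layer` on an 8x8 input with a 3x3 kernel returns a 6x6 activation map holding at most `2**act_bits` distinct values, all within the range defined by the two parameters.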