CPC G06N 3/08 (2013.01) [G06F 7/582 (2013.01); G06N 3/047 (2023.01); G06N 3/063 (2013.01)] | 20 Claims |
1. A method for network quantization using multi-dimensional vectors, comprising:
constructing multi-dimensional vectors representing network parameters from a trained neural network model;
quantizing the multi-dimensional vectors to obtain shared quantized vectors as cluster centers;
individually fine-tuning each dimensional element of each of the shared quantized vectors/cluster centers with a stochastic gradient descent method using an average gradient of a network loss function for a respective divided group of the network parameters corresponding to the dimensional element; and
encoding using the shared quantized vectors/cluster centers.
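The steps of claim 1 can be illustrated with a minimal, non-authoritative sketch in plain Python. It is not the patented implementation: plain k-means stands in for the quantizer, the per-parameter gradients are assumed to be supplied by the caller, and all function names (build_vectors, kmeans, fine_tune_step, decode) are hypothetical. The encoded form here is simply the list of cluster indices plus the codebook of cluster centers.

```python
import random

def build_vectors(weights, n):
    """Group a flat list of parameters into n-dimensional vectors (zero-padded tail)."""
    padded = list(weights) + [0.0] * (-len(weights) % n)
    return [padded[i:i + n] for i in range(0, len(padded), n)]

def nearest(vec, centers):
    """Index of the closest center by squared Euclidean distance."""
    return min(range(len(centers)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(vec, centers[k])))

def kmeans(vectors, k, iters=20, seed=0):
    """Assumed quantizer: plain k-means. Returns (cluster centers, assignments)."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iters):
        assign = [nearest(v, centers) for v in vectors]
        for c in range(k):
            members = [v for v, a in zip(vectors, assign) if a == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    assign = [nearest(v, centers) for v in vectors]
    return centers, assign

def fine_tune_step(centers, assign, grads, lr):
    """One SGD step per center element.

    grads[i][j] is the loss gradient for the parameter at position j of
    vector i (assumed to come from backpropagation elsewhere). Element j of
    center c moves against the average of grads[i][j] over vectors i in c.
    """
    for c, center in enumerate(centers):
        members = [g for g, a in zip(grads, assign) if a == c]
        if not members:
            continue
        for j in range(len(center)):
            avg_grad = sum(g[j] for g in members) / len(members)
            center[j] -= lr * avg_grad
    return centers

def decode(centers, assign, length):
    """Rebuild the flat parameter list from cluster indices and the codebook."""
    flat = [x for a in assign for x in centers[a]]
    return flat[:length]
```

In this sketch the cluster assignments stay fixed during fine-tuning, so only the shared codebook entries change and the index list can be encoded once.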