CPC G06N 3/082 (2013.01) [G06F 1/3296 (2013.01); G06F 9/3877 (2013.01); G06F 12/0875 (2013.01); G06F 13/16 (2013.01); G06F 16/285 (2019.01); G06N 3/04 (2013.01); G06N 3/044 (2023.01); G06N 3/048 (2023.01); G06N 3/063 (2013.01); G06N 3/084 (2013.01); G06F 2212/452 (2013.01); G06F 2213/0026 (2013.01)] | 16 Claims |
1. A data quantization method, comprising:
grouping weights of a neural network;
performing a clustering operation on each group of weights by using a clustering algorithm, dividing a group of weights into m classes, computing a center weight for each class, and replacing all the weights in each class with the center weight of that class, where m is a positive integer;
encoding the center weight to get a weight codebook and a weight dictionary,
wherein the grouping includes grouping into a group, an inter-layer-based grouping, and an intra-layer-based grouping, and the method further includes grouping convolutional layers into one group, grouping fully connected layers by the intra-layer-based grouping, and grouping LSTM layers by the inter-layer-based grouping.
|
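The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and everything in it beyond the claim wording is an assumption: the clustering algorithm is taken to be a plain 1-D k-means, the "weight codebook" is read as the table mapping a class index to its center weight, and the "weight dictionary" is read as the array of class indices stored at each weight position. The function names (`kmeans_1d`, `quantize_group`, `group_layers`), the `fc_parts` parameter, and the choice to treat each LSTM layer as its own group are hypothetical and only show one possible reading of the grouping clause.

```python
import numpy as np


def kmeans_1d(values, m, iters=20):
    """Plain Lloyd's k-means on a 1-D array; returns (centers, labels)."""
    # Assumption: centers start evenly spaced over the value range,
    # which keeps the sketch deterministic.
    centers = np.linspace(values.min(), values.max(), m)
    labels = np.zeros(values.shape, dtype=np.int64)
    for _ in range(iters):
        # Assign every weight to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its members (skip empty classes).
        for k in range(m):
            members = values[labels == k]
            if members.size:
                centers[k] = members.mean()
    return centers, labels


def quantize_group(weights, m):
    """Cluster one group of weights into m classes and encode the result."""
    flat = np.asarray(weights, dtype=np.float64).ravel()
    centers, labels = kmeans_1d(flat, m)
    codebook = centers                                # class index -> center weight
    dictionary = labels.reshape(np.shape(weights))    # position -> class index
    quantized = codebook[dictionary]                  # weights replaced by centers
    return codebook, dictionary, quantized


def group_layers(layers, fc_parts=2):
    """Group weights by layer type; layers is a list of (kind, ndarray)."""
    groups = []
    # All convolutional layers form a single group.
    conv = [w.ravel() for kind, w in layers if kind == "conv"]
    if conv:
        groups.append(np.concatenate(conv))
    for kind, w in layers:
        if kind == "fc":
            # Intra-layer grouping: split one layer's weights into parts.
            groups.extend(np.array_split(w.ravel(), fc_parts))
        elif kind == "lstm":
            # Inter-layer grouping, read here as one group per LSTM layer.
            groups.append(w.ravel())
    return groups


if __name__ == "__main__":
    w = np.array([[0.10, 0.12, 0.90], [0.88, 0.11, 0.92]])
    codebook, dictionary, quantized = quantize_group(w, m=2)
    print(codebook)
    print(dictionary)
    print(quantized)
```

Under this reading, only the codebook (m center weights) and the dictionary (one small integer per weight) need to be stored, and the quantized weights are recovered by the lookup `codebook[dictionary]`.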