CPC G06N 3/08 (2013.01) [G06N 3/04 (2013.01)] | 17 Claims |
1. A neural network model processing method, comprising:
obtaining a first low-bit neural network model through training, wherein the first low-bit neural network model comprises at least three operation layers, wherein the at least three operation layers comprise a first operation layer and a second operation layer, wherein each of the at least three operation layers comprises at least one operation, wherein one or more values of one or more of a parameter or data used for the at least one operation are represented using N bits, and wherein N is a positive integer less than eight;
compressing the first low-bit neural network model to obtain a second low-bit neural network model, wherein the second low-bit neural network model comprises at least two operation layers, wherein the at least two operation layers comprise a third operation layer, wherein the third operation layer is equivalent to a combination of the first operation layer and the second operation layer, and wherein an operation layer other than the third operation layer in the at least two operation layers is the same as an operation layer other than the first operation layer and the second operation layer in the at least three operation layers;
searching the at least three operation layers for the first operation layer and the second operation layer;
combining the first operation layer and the second operation layer to obtain the third operation layer, wherein an input of the first operation layer is the same as an input of the third operation layer, wherein an output of the first operation layer is an input of the second operation layer, and wherein an output of the second operation layer is the same as an output of the third operation layer; and
constructing the second low-bit neural network model based on the third operation layer and the operation layer other than the first operation layer and the second operation layer in the at least three operation layers.
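The search, combine, and construct steps recited in claim 1 can be illustrated with a minimal Python sketch. This is not the patented implementation. It assumes a hypothetical model made of two element-wise integer affine layers followed by a ReLU layer, an assumed bit width of N = 4, and an assumed fusion rule (two adjacent affine layers collapse into one affine layer). The names `Layer`, `find_fusable_pair`, `fuse`, and `compress` are invented for illustration and do not come from the claim.

```python
from dataclasses import dataclass
from typing import List, Optional

N_BITS = 4  # assumed value of N; the claim only requires a positive integer below eight


@dataclass(frozen=True)
class Layer:
    kind: str       # "affine" or "relu" (hypothetical layer kinds)
    scale: int = 1  # used by "affine" layers only
    shift: int = 0  # used by "affine" layers only


def fits_in_n_bits(value: int, n: int = N_BITS) -> bool:
    """True if value is representable as a signed n-bit integer."""
    return -(1 << (n - 1)) <= value <= (1 << (n - 1)) - 1


def run_layer(layer: Layer, xs: List[int]) -> List[int]:
    if layer.kind == "affine":
        return [layer.scale * x + layer.shift for x in xs]
    if layer.kind == "relu":
        return [max(0, x) for x in xs]
    raise ValueError(f"unknown layer kind: {layer.kind}")


def run_model(layers: List[Layer], xs: List[int]) -> List[int]:
    for layer in layers:
        xs = run_layer(layer, xs)
    return xs


def find_fusable_pair(layers: List[Layer]) -> Optional[int]:
    """Search step: index of the first of two adjacent affine layers, or None."""
    for i in range(len(layers) - 1):
        if layers[i].kind == "affine" and layers[i + 1].kind == "affine":
            return i
    return None


def fuse(first: Layer, second: Layer) -> Layer:
    """Combine step: second(first(x)) = s2 * (s1 * x + b1) + b2."""
    return Layer("affine",
                 scale=second.scale * first.scale,
                 shift=second.scale * first.shift + second.shift)


def compress(layers: List[Layer]) -> List[Layer]:
    """Construct step: fused layer plus all other layers, left unchanged."""
    i = find_fusable_pair(layers)
    if i is None:
        return list(layers)
    return layers[:i] + [fuse(layers[i], layers[i + 1])] + layers[i + 2:]


# First model: three operation layers whose parameters fit in N bits.
first_model = [Layer("affine", 3, -1), Layer("affine", 2, 1), Layer("relu")]
assert all(fits_in_n_bits(v) for l in first_model for v in (l.scale, l.shift))

# Second model: two operation layers; the first one replaces the affine pair.
second_model = compress(first_model)
```

In this sketch the fused layer takes the input of the first affine layer and produces the output of the second, and the ReLU layer is carried over unchanged, which mirrors the input/output conditions in the combining step. Whether the fused parameters still fit in N bits depends on the values involved; the sketch does not enforce that.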