| CPC G06N 3/04 (2013.01) [H03M 7/30 (2013.01); H03M 7/6011 (2013.01)] | 20 Claims |

|
1. A method of unification based coding for neural network model compression, the method being performed by at least one processor, and the method comprising:
receiving a layer uniform flag indicating whether a quantized weight of an input neural network is encoded using a uniform coding method;
determining whether the quantized weight is encoded using the uniform coding method, based on the received layer uniform flag;
based on the quantized weight being determined to be encoded using the uniform coding method, encoding the quantized weight using the uniform coding method; and
based on the quantized weight being determined to not be encoded using the uniform coding method, encoding the quantized weight using a non-uniform coding method.
|
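
The flag-driven selection recited in claim 1 can be illustrated with a minimal sketch. The claim does not define the two coding methods, so both coders here are hypothetical stand-ins chosen only to make the example runnable: a fixed-length binary code stands in for the "uniform coding method" and an order-0 Exp-Golomb code stands in for the "non-uniform coding method". The function names, the 8-bit width, and the bit-string output are likewise assumptions, not part of the claimed method.

```python
from typing import List


def encode_uniform(weights: List[int], bits: int = 8) -> str:
    """Hypothetical stand-in for the uniform coding method.

    Emits a fixed-length two's-complement code per weight. Assumes every
    weight fits in `bits` bits; that range is an illustration-only choice.
    """
    mask = (1 << bits) - 1
    return "".join(format(w & mask, f"0{bits}b") for w in weights)


def encode_non_uniform(weights: List[int]) -> str:
    """Hypothetical stand-in for the non-uniform coding method.

    Emits an order-0 Exp-Golomb code after zigzag-mapping signed weights
    to non-negative integers, so small magnitudes get shorter codes.
    """
    out = []
    for w in weights:
        u = 2 * w if w >= 0 else -2 * w - 1   # zigzag: 0, -1, 1, -2, ... -> 0, 1, 2, 3, ...
        code = format(u + 1, "b")              # binary of u + 1
        out.append("0" * (len(code) - 1) + code)  # leading zeros give the code length
    return "".join(out)


def encode_quantized_weight(weights: List[int], layer_uniform_flag: bool) -> str:
    """Select the coder based on the received layer uniform flag."""
    if layer_uniform_flag:
        # Flag set: the quantized weight is encoded with the uniform method.
        return encode_uniform(weights)
    # Flag not set: the quantized weight is encoded with the non-uniform method.
    return encode_non_uniform(weights)
```

Calling `encode_quantized_weight([0, 1, -1], True)` takes the uniform branch, and the same call with `False` takes the non-uniform branch. In this sketch the flag is the only input that decides which coder runs, mirroring the two "based on" steps of the claim.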