| CPC G06N 3/082 (2013.01) [G06N 3/04 (2013.01); H03M 7/6005 (2013.01); G06F 11/34 (2013.01)] | 29 Claims |

|
1. A computer-implemented method for performing model compression, the method comprising:
compressing a machine learning (ML) network model comprising a multiple layer structure to produce a compressed ML network model, the compressed ML network model maintaining the multiple layer structure of the ML network model; and
generating a model file for the compressed ML network model, the model file comprising the compressed ML network model and decoding information for enabling the ML network model to be decompressed and executed layer-by-layer.
|
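As an illustration of claim 1, the following is a minimal sketch of one way the two steps could be realized; it is not the claimed implementation. It makes several assumptions that do not come from the claim: each layer is a flat list of float weights, `zlib` stands in for whatever compression scheme is actually used, and the model file layout (a length-prefixed JSON header followed by the concatenated per-layer payloads) is hypothetical. The function names `compress_model`, `write_model_file`, `iter_layers`, and `run_layer_by_layer` are likewise invented for this example. Each layer is compressed independently, so the layer structure is preserved, and the header carries the decoding information needed to decompress and execute one layer at a time.

```python
import json
import struct
import zlib


def compress_model(layers):
    """Compress each layer independently so the layer structure is preserved.

    `layers` is a list of (name, weights) pairs; weights is a list of floats.
    zlib is an assumed stand-in codec, not one named by the claim.
    """
    blobs, index, offset = [], [], 0
    for name, weights in layers:
        raw = struct.pack(f"<{len(weights)}f", *weights)  # little-endian float32
        blob = zlib.compress(raw)
        # Per-layer decoding information: where the blob lives and how to unpack it.
        index.append({"name": name, "count": len(weights),
                      "offset": offset, "length": len(blob)})
        blobs.append(blob)
        offset += len(blob)
    return index, b"".join(blobs)


def write_model_file(layers):
    """Build model-file bytes: 4-byte header length, JSON header, payload.

    This layout is hypothetical and chosen only for illustration.
    """
    index, payload = compress_model(layers)
    header = json.dumps({"codec": "zlib", "layers": index}).encode("utf-8")
    return struct.pack("<I", len(header)) + header + payload


def iter_layers(model_file):
    """Yield (name, weights) one layer at a time, decompressing lazily."""
    (header_len,) = struct.unpack_from("<I", model_file, 0)
    header = json.loads(model_file[4:4 + header_len].decode("utf-8"))
    base = 4 + header_len
    for entry in header["layers"]:
        start = base + entry["offset"]
        raw = zlib.decompress(model_file[start:start + entry["length"]])
        yield entry["name"], list(struct.unpack(f"<{entry['count']}f", raw))


def run_layer_by_layer(model_file, x):
    """Toy execution: each layer multiplies elementwise, then applies ReLU.

    Only one layer's weights are decompressed and held at any moment.
    """
    for _name, weights in iter_layers(model_file):
        x = [max(0.0, w * v) for w, v in zip(weights, x)]
    return x
```

In this sketch, a caller would pass a list such as `[("fc1", [0.5, -1.0, 2.0]), ("fc2", [2.0, 3.0, 0.25])]` to `write_model_file` and then hand the resulting bytes to `run_layer_by_layer`. Because `iter_layers` is a generator, the weights of a later layer are not decompressed until the earlier layers have been executed.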