US 12,271,819 B2
Machine learning network model compression
Jiafeng Zhu, Pleasanton, CA (US); Wei Wei, Plano, TX (US); Jianle Chen, San Diego, CA (US); Wei Wang, Santa Clara, CA (US); and Jie Shen, Santa Clara, CA (US)
Assigned to Huawei Technologies Co., Ltd., Shenzhen (CN)
Filed by Huawei Technologies Co., Ltd., Shenzhen (CN)
Filed on Jul. 9, 2021, as Appl. No. 17/371,590.
Application 17/371,590 is a continuation of application No. PCT/US2019/040723, filed on Jul. 5, 2019.
Claims priority of provisional application 62/790,387, filed on Jan. 9, 2019.
Prior Publication US 2021/0342694 A1, Nov. 4, 2021
Int. Cl. G06N 3/082 (2023.01); G06N 3/04 (2023.01); H03M 7/30 (2006.01); G06F 11/34 (2006.01)

CPC G06N 3/082 (2013.01) [G06N 3/04 (2013.01); H03M 7/6005 (2013.01); G06F 11/34 (2013.01)]

29 Claims

1. A computer-implemented method for performing model compression, the method comprising:

compressing a machine learning (ML) network model comprising a multiple layer structure to produce a compressed ML network model, the compressed ML network model maintaining the multiple layer structure of the ML network model; and

generating a model file for the compressed ML network model, the model file comprising the compressed ML network model and decoding information for enabling the ML network model to be decompressed and executed layer-by-layer.
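
For illustration only, the two steps of claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: the file layout (a length-prefixed JSON header followed by per-layer payloads), the use of zlib as the codec, the toy "scale, shift, ReLU" layers, and the names `build_model_file`, `iter_layers`, and `run` are all hypothetical and do not come from the patent. The sketch compresses each layer separately so the layer structure is preserved, stores per-layer decoding information in the header, and then decompresses and executes one layer at a time.

```python
import io
import json
import struct
import zlib


def build_model_file(layers):
    """Compress each layer on its own and prepend per-layer decoding info.

    `layers` is a list of (name, weights) pairs; weights is a list of floats.
    The header layout is an assumption made for this sketch.
    """
    blobs, index, offset = [], [], 0
    for name, weights in layers:
        raw = struct.pack(f"<{len(weights)}f", *weights)  # float32, little-endian
        blob = zlib.compress(raw)
        # Decoding information: which codec, how many values, where the bytes live.
        index.append({"name": name, "codec": "zlib", "count": len(weights),
                      "offset": offset, "length": len(blob)})
        blobs.append(blob)
        offset += len(blob)
    header = json.dumps({"layers": index}).encode("utf-8")
    return struct.pack("<I", len(header)) + header + b"".join(blobs)


def iter_layers(model_file):
    """Yield (name, weights) one layer at a time, decompressing lazily."""
    stream = io.BytesIO(model_file)
    (header_len,) = struct.unpack("<I", stream.read(4))
    header = json.loads(stream.read(header_len))
    payload_start = 4 + header_len
    for entry in header["layers"]:
        stream.seek(payload_start + entry["offset"])
        raw = zlib.decompress(stream.read(entry["length"]))
        yield entry["name"], list(struct.unpack(f"<{entry['count']}f", raw))


def run(model_file, x):
    """Toy execution: each layer holds (w, b) and computes relu(w * x + b)."""
    for _name, (w, b) in iter_layers(model_file):
        x = max(0.0, w * x + b)  # only the current layer is decompressed
    return x
```

Because the header records an offset and length for every layer, a runtime following this sketch never needs the whole decompressed model in memory at once; it can read, decode, apply, and discard each layer in turn.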