US 12,293,274 B2
Method and apparatus for unification based coding for neural network model compression
Wei Wang, Palo Alto, CA (US); Wei Jiang, Palo Alto, CA (US); and Shan Liu, Palo Alto, CA (US)
Assigned to TENCENT AMERICA LLC, Palo Alto, CA (US)
Filed by TENCENT AMERICA LLC, Palo Alto, CA (US)
Filed on Jul. 1, 2021, as Appl. No. 17/365,367.
Claims priority of provisional application 63/089,443, filed on Oct. 8, 2020.
Prior Publication US 2022/0114414 A1, Apr. 14, 2022
Int. Cl. H03M 7/00 (2006.01); G06F 17/00 (2019.01); G06N 3/04 (2023.01); H03M 7/30 (2006.01)
CPC G06N 3/04 (2013.01) [H03M 7/30 (2013.01); H03M 7/6011 (2013.01)]
20 Claims
 
1. A method of unification based coding for neural network model compression, the method being performed by at least one processor, and the method comprising:
receiving a layer uniform flag indicating whether a quantized weight of an input neural network is encoded using a uniform coding method;
determining whether the quantized weight is encoded using the uniform coding method, based on the received layer uniform flag;
based on the quantized weight being determined to be encoded using the uniform coding method, encoding the quantized weight using the uniform coding method; and
based on the quantized weight being determined to not be encoded using the uniform coding method, encoding the quantized weight using a non-uniform coding method.
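The flag-driven branch recited in claim 1 can be illustrated with a minimal sketch. The claim does not define the two coding methods, so the ones below are assumptions chosen only for illustration: a fixed-length binary code stands in for the "uniform coding method", and an order-0 Exp-Golomb code with zigzag sign mapping stands in for the "non-uniform coding method". The names `encode_uniform`, `encode_non_uniform`, `encode_layer`, and the `bit_depth` parameter are hypothetical and do not come from the patent.

```python
from typing import List


def encode_uniform(weights: List[int], bit_depth: int = 8) -> str:
    """Hypothetical stand-in for the uniform coding method:
    every quantized weight gets the same number of bits."""
    offset = 1 << (bit_depth - 1)  # shift signed values into [0, 2**bit_depth)
    bits = []
    for w in weights:
        v = w + offset
        if not 0 <= v < (1 << bit_depth):
            raise ValueError(f"weight {w} does not fit in {bit_depth} bits")
        bits.append(format(v, f"0{bit_depth}b"))
    return "".join(bits)


def encode_non_uniform(weights: List[int]) -> str:
    """Hypothetical stand-in for the non-uniform coding method:
    order-0 Exp-Golomb, so small magnitudes get shorter codes."""
    bits = []
    for w in weights:
        # Zigzag mapping: 0, 1, -1, 2, -2 -> 0, 1, 2, 3, 4
        u = 2 * w - 1 if w > 0 else -2 * w
        code = format(u + 1, "b")
        bits.append("0" * (len(code) - 1) + code)  # zero prefix, then value
    return "".join(bits)


def encode_layer(weights: List[int], layer_uniform_flag: bool) -> str:
    """Select the coding method from the received layer uniform flag."""
    if layer_uniform_flag:
        return encode_uniform(weights)
    return encode_non_uniform(weights)


if __name__ == "__main__":
    quantized = [0, 1, -1]
    print(encode_layer(quantized, layer_uniform_flag=True))
    print(encode_layer(quantized, layer_uniform_flag=False))
```

In this sketch the flag is passed in as a plain Boolean; how the flag is signaled in a bitstream and what the actual coding methods are depend on the remaining claims and the specification, which are not reproduced here.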