US 11,811,429 B2
Variational dropout with smoothness regularization for neural network model compression
Wei Jiang, Palo Alto, CA (US); Wei Wang, Palo Alto, CA (US); and Shan Liu, Palo Alto, CA (US)
Assigned to TENCENT AMERICA LLC, Palo Alto, CA (US)
Filed by TENCENT AMERICA LLC, Palo Alto, CA (US)
Filed on Sep. 28, 2020, as Appl. No. 17/034,739.
Claims priority of provisional application 62/939,060, filed on Nov. 22, 2019.
Claims priority of provisional application 62/915,337, filed on Oct. 15, 2019.
Prior Publication US 2021/0111736 A1, Apr. 15, 2021
Int. Cl. H03M 7/00 (2006.01); H03M 7/30 (2006.01); G06N 3/084 (2023.01); G06N 5/046 (2023.01); G06F 18/211 (2023.01); G06F 18/214 (2023.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01)
CPC H03M 7/70 (2013.01) [G06F 18/211 (2023.01); G06F 18/2155 (2023.01); G06N 3/084 (2013.01); G06N 5/046 (2013.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); H03M 7/60 (2013.01)]
20 Claims
 
8. A computer system for compressing a deep neural network model, the computer system comprising:
one or more computer-readable non-transitory storage media configured to store computer program code; and
one or more computer processors configured to access said computer program code and operate as instructed by said computer program code, said computer program code including:
quantizing and entropy-coding code configured to cause the one or more computer processors to quantize and entropy-code weight coefficients associated with the deep neural network;
smoothing code configured to cause the one or more computer processors to locally smooth the quantized and entropy-coded weight coefficients; and
compressing code configured to cause the one or more computer processors to compress the smoothed weight coefficients based on applying a variational dropout to the weight coefficients.
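
The four operations named in claim 8 (quantization, entropy coding, local smoothing, and variational dropout) can each be illustrated with a toy example. The following is a minimal sketch under stated assumptions, not the patented method: the function names, the uniform quantizer, the moving-average smoother, the use of empirical entropy as a stand-in for an actual entropy coder, and the log-alpha pruning threshold of 3.0 are all hypothetical choices made for illustration, and the claim does not specify any of them.

```python
import math
from collections import Counter


def quantize(weights, step):
    """Uniform scalar quantization (assumed): map each weight to an integer bin index."""
    return [round(w / step) for w in weights]


def entropy_bits(symbols):
    """Empirical Shannon entropy in bits per symbol.

    This only estimates the cost an ideal entropy coder would approach;
    it does not produce a bitstream.
    """
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def smooth_local(weights, radius=1):
    """Local smoothing (assumed form): average each weight with its neighbours."""
    out = []
    for i in range(len(weights)):
        lo = max(0, i - radius)
        hi = min(len(weights), i + radius + 1)
        window = weights[lo:hi]
        out.append(sum(window) / len(window))
    return out


def variational_dropout_mask(means, log_sigma2, threshold=3.0):
    """Keep a weight when log_alpha = log(sigma^2 / mean^2) is below the threshold.

    A large log_alpha means the learned noise dominates the weight's mean,
    so the weight is treated as prunable. The threshold is an assumption.
    """
    mask = []
    for m, ls2 in zip(means, log_sigma2):
        log_alpha = ls2 - math.log(m * m + 1e-12)
        mask.append(log_alpha < threshold)
    return mask
```

In this sketch, smoother weight tensors tend to quantize to fewer distinct symbols, which lowers the entropy estimate, and the dropout mask removes weights whose learned variance is large relative to their magnitude. How the claimed system actually orders and couples these steps is defined by the claim language above, not by this example.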