US 11,757,469 B2
Compression technique for deep neural network weights
Prajakt Kulkarni, San Diego, CA (US); Lakshmi Narayana Macha, San Diego, CA (US); and Haoping Xu, North York (CA)
Assigned to QUALCOMM Incorporated, San Diego, CA (US)
Filed by QUALCOMM Incorporated, San Diego, CA (US)
Filed on Apr. 1, 2021, as Appl. No. 17/220,620.
Prior Publication US 2022/0321143 A1, Oct. 6, 2022
Int. Cl. H03M 7/30 (2006.01); G06N 3/04 (2023.01)
CPC H03M 7/702 (2013.01) [G06N 3/04 (2013.01)]
30 Claims
 
1. A method performed in a processor of a computing device, comprising:
receiving a weight data set of binary numbers representing weight values;
generating a first frame payload comprising a compressed first frame of a first subset of the weight values in the weight data set;
generating a first frame header associated with the first frame payload, wherein the first frame header includes a normalization factor indicator for the compressed first frame; and
generating a block of compressed weight data having the first frame payload.
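
The steps of claim 1 can be illustrated with a minimal sketch. The claim does not say how a frame is compressed or what the normalization factor indicator encodes, so everything specific here is an assumption made for illustration: the frame size of 8 weights, the use of a per-frame bit width as the "normalization factor", the two-byte header layout, and the function names `bits_needed`, `compress_frame`, `decompress_frame`, and `build_block` are all hypothetical and are not taken from the patent.

```python
from typing import List, Tuple

FRAME_SIZE = 8  # hypothetical number of weight values per frame


def bits_needed(values: List[int]) -> int:
    """Smallest two's-complement width that can hold every value."""
    width = 1
    for v in values:
        magnitude = v if v >= 0 else ~v
        # bit_length of the magnitude plus one sign bit
        width = max(width, magnitude.bit_length() + 1)
    return width


def compress_frame(values: List[int]) -> Tuple[bytes, bytes]:
    """Return (header, payload) for one frame of signed integer weights."""
    width = bits_needed(values)
    mask = (1 << width) - 1
    acc = 0
    for v in values:
        # pack each weight into `width` bits, first weight in the high bits
        acc = (acc << width) | (v & mask)
    total_bits = width * len(values)
    payload = acc.to_bytes((total_bits + 7) // 8, "big")
    # Assumed header layout: [bit width, number of weights].
    # The bit width plays the role of the normalization factor indicator.
    header = bytes([width, len(values)])
    return header, payload


def decompress_frame(header: bytes, payload: bytes) -> List[int]:
    """Invert compress_frame using the width stored in the header."""
    width, count = header[0], header[1]
    mask = (1 << width) - 1
    acc = int.from_bytes(payload, "big")
    values = []
    for i in range(count):
        raw = (acc >> (width * (count - 1 - i))) & mask
        if raw >= 1 << (width - 1):  # restore the sign
            raw -= 1 << width
        values.append(raw)
    return values


def build_block(weights: List[int]) -> bytes:
    """Compress the first frame of the weight set into one block."""
    first_frame = weights[:FRAME_SIZE]  # first subset of the weight values
    header, payload = compress_frame(first_frame)
    # Assumption: the header is stored in the block ahead of its payload.
    return header + payload


weights = [3, -2, 1, 0, -4, 2, 1, -1]
block = build_block(weights)
restored = decompress_frame(block[:2], block[2:])
```

In this sketch the eight example weights all fit in 3 bits, so the payload occupies 3 bytes instead of the 8 bytes needed at one byte per weight, and the header tells a decoder how to unpack them. A real implementation of the claimed method could differ in every one of these choices.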