US 12,425,047 B2
Methods and apparatus to perform weight and activation compression and decompression
Nilesh Jain, Portland, OR (US); Menachem Adelman, Haifa (IL); Raanan Sade, Kibutz Sarid (IL); Ravishankar Iyer, Portland, OR (US); Rajesh Poornachandran, Portland, OR (US); and Yash Akhauri, Uttar Pradesh (IN)
Assigned to INTEL CORPORATION, Santa Clara, CA (US)
Filed by INTEL CORPORATION, Santa Clara, CA (US)
Filed on Sep. 23, 2021, as Appl. No. 17/483,693.
Claims priority of application No. 202141026534 (IN), filed on Jun. 15, 2021.
Prior Publication US 2022/0012592 A1, Jan. 13, 2022
Int. Cl. H03M 7/30 (2006.01); G06F 17/16 (2006.01); G06N 3/082 (2023.01)
CPC H03M 7/70 (2013.01) [G06F 17/16 (2013.01); G06N 3/082 (2013.01)]
18 Claims
 
1. An apparatus comprising:
memory;
instructions in the apparatus; and
processor circuitry to execute the instructions to:
execute a compression operation on a first portion of a matrix of weights for a neural network to obtain first compressed data, the first portion of the matrix is a sparse portion;
execute the compression operation on a second portion of the matrix of weights for a neural network to obtain second compressed data, the second portion of the matrix is a sparse portion;
determine first meta-data associated with the first compressed data and second meta-data associated with the second compressed data, a first portion of the first meta-data indicative of whether the first compressed data is compressed, a second portion of the first meta-data indicative of a cache size of the first compressed data, and a third portion of the first meta-data indicative of the compression operation executed to obtain the first compressed data;
cause the first meta-data and the second meta-data to be stored contiguously in a first row of a memory;
cause the first compressed data to be stored in a second row in the memory; and
cause the second compressed data to be stored in a third row in the memory, wherein the second row and the third row are consecutive rows.
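
The storage layout recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The row width (`ROW_BYTES`), the 4-byte meta-data encoding, the operation identifiers, and the bitmask-based `bitmask_compress` routine are all assumptions introduced for illustration; the claim does not specify any of them. The sketch only shows the arrangement the claim describes: the meta-data entries for both portions packed contiguously in a first row, followed by the two compressed portions in consecutive rows.

```python
from dataclasses import dataclass
from typing import List

ROW_BYTES = 64   # assumed row width; the claim does not fix a size
OP_NONE = 0      # hypothetical identifier: no compression applied
OP_BITMASK = 1   # hypothetical identifier: bitmask compression (below)


@dataclass
class MetaData:
    compressed: bool  # first portion: whether the data is compressed
    size: int         # second portion: size of the compressed data, in bytes
    op: int           # third portion: which compression operation was used

    def pack(self) -> bytes:
        # Assumed encoding: 1 flag byte, 2 size bytes (little-endian), 1 op byte.
        return (bytes([int(self.compressed)])
                + self.size.to_bytes(2, "little")
                + bytes([self.op]))


def bitmask_compress(block: List[int]) -> bytes:
    """Stand-in sparse compression for int8 weights: a presence bitmask
    (one bit per weight) followed by the nonzero values only."""
    mask = bytearray((len(block) + 7) // 8)
    values = bytearray()
    for i, w in enumerate(block):
        if w != 0:
            mask[i // 8] |= 1 << (i % 8)
            values.append(w & 0xFF)  # store the int8 value as one raw byte
    return bytes(mask) + bytes(values)


def layout(portions: List[List[int]]) -> List[bytes]:
    """Return memory rows: row 0 holds all meta-data contiguously,
    rows 1..n hold the compressed portions in consecutive rows."""
    metas, payloads = [], []
    for block in portions:
        data = bitmask_compress(block)
        if len(data) > ROW_BYTES:
            raise ValueError("compressed portion does not fit in one row")
        metas.append(MetaData(True, len(data), OP_BITMASK))
        payloads.append(data)

    meta_row = b"".join(m.pack() for m in metas)
    if len(meta_row) > ROW_BYTES:
        raise ValueError("meta-data does not fit in one row")

    rows = [meta_row.ljust(ROW_BYTES, b"\x00")]                # first row
    rows += [p.ljust(ROW_BYTES, b"\x00") for p in payloads]   # consecutive rows
    return rows


# Two sparse portions of a weight matrix, 32 int8 weights each.
first = [0] * 32
first[3], first[10] = 5, -2
second = [0] * 32
second[0] = 7
rows = layout([first, second])
```

With this arrangement, a reader of the memory would fetch the single meta-data row first, learn from it the size and compression operation of each portion, and then read the following rows in order to decompress them.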