US 12,288,142 B2
Sparsity-aware compute-in-memory
Ren Li, San Diego, CA (US); Ankit Srivastava, San Diego, CA (US); Seyed Arash Mirhaj, Poway, CA (US); and Sameer Wadhwa, San Diego, CA (US)
Assigned to QUALCOMM Incorporated, San Diego, CA (US)
Filed by QUALCOMM Incorporated, San Diego, CA (US)
Filed on Aug. 9, 2021, as Appl. No. 17/397,653.
Prior Publication US 2023/0049323 A1, Feb. 16, 2023
Int. Cl. G06N 20/00 (2019.01); H03K 19/20 (2006.01); H03M 7/30 (2006.01)
CPC G06N 20/00 (2019.01) [H03K 19/20 (2013.01); H03M 7/30 (2013.01)]
28 Claims
 
1. A method, comprising:
disabling one or more bit cells in a compute-in-memory (CIM) array based on a sparsity of input data for a machine learning model prior to processing the input data;
determining that a sparsity of weight data for the machine learning model exceeds a weight data sparsity threshold;
resequencing the weight data according to the sparsity of the weight data;
disabling one or more bit cells in the CIM array based on the resequenced weight data;
resequencing the input data based on the resequenced weight data;
processing the input data with bit cells not disabled in the CIM array to generate an output value;
applying a compensation to the output value based on the sparsity of the input data to generate a compensated output value; and
outputting the compensated output value.
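
The steps of claim 1 can be illustrated with a minimal software sketch. It is not the patented circuit and is not taken from the specification. It assumes one CIM column is modeled as a dot product in which each row is a bit cell, that "disabling" a cell means skipping its row, that "resequencing" means a stable reordering that places non-zero weights first, and that the compensation is a constant added per row disabled for input sparsity. The function name `sparsity_aware_mac` and the parameters `weight_threshold` and `offset_per_disabled_row` are hypothetical; the claim does not specify the threshold value or the form of the compensation.

```python
def sparsity(values):
    """Fraction of entries that are zero."""
    return sum(1 for v in values if v == 0) / len(values)


def sparsity_aware_mac(inputs, weights, weight_threshold=0.5,
                       offset_per_disabled_row=0.0):
    """Toy model of claim 1 for a single CIM column (all modeling choices assumed)."""
    n = len(inputs)
    if n != len(weights):
        raise ValueError("inputs and weights must have the same length")

    # Disable rows (bit cells) whose input is zero, before any processing.
    enabled = [x != 0 for x in inputs]
    n_input_disabled = enabled.count(False)

    order = list(range(n))
    # Only act on weight sparsity when it exceeds the threshold.
    if sparsity(weights) > weight_threshold:
        # Resequence weights: stable sort that moves non-zero weights first.
        order = sorted(range(n), key=lambda i: weights[i] == 0)
        weights = [weights[i] for i in order]
        # Resequence the inputs (and their enable flags) with the same order.
        inputs = [inputs[i] for i in order]
        enabled = [enabled[i] for i in order]
        # Disable rows that hold zero weights after resequencing.
        enabled = [e and w != 0 for e, w in zip(enabled, weights)]

    # Process the inputs using only the rows that are still enabled.
    raw = sum(x * w for x, w, e in zip(inputs, weights, enabled) if e)

    # Assumed compensation: a constant per row disabled for input sparsity.
    compensated = raw + offset_per_disabled_row * n_input_disabled
    return compensated, enabled, order


if __name__ == "__main__":
    out, enabled, order = sparsity_aware_mac([1, 2, 0, 3, 0, 4],
                                             [0, 5, 0, 0, 0, 2])
    print(out, enabled, order)
```

With the default `offset_per_disabled_row=0.0`, the compensated value equals the ordinary dense dot product, which makes the sketch easy to check against a reference computation. In hardware the compensation would depend on how a disabled bit cell affects the analog readout, which this model does not attempt to capture.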