US 12,340,299 B2
Sparsity-based neural network mapping to computing units in a system-on-chip
Hee Jun Park, San Diego, CA (US); and Colin Beaton Verrilli, Apex, NC (US)
Assigned to QUALCOMM Incorporated, San Diego, CA (US)
Filed by QUALCOMM Incorporated, San Diego, CA (US)
Filed on Mar. 5, 2021, as Appl. No. 17/194,202.
Prior Publication US 2022/0284271 A1, Sep. 8, 2022
Int. Cl. G06N 3/063 (2023.01); G06F 9/50 (2006.01); G06F 11/30 (2006.01); G06F 11/34 (2006.01)
CPC G06N 3/063 (2013.01) [G06F 9/5094 (2013.01); G06F 11/3062 (2013.01); G06F 11/3452 (2013.01)]
28 Claims
 
1. A method for an artificial neural network, comprising:
receiving a set of input values to be convolved with a plurality of kernels via a plurality of computing units of a system-on-chip (SOC);
detecting a temperature associated with each of multiple computing units of the plurality of computing units of the SOC;
mapping the plurality of kernels to the plurality of computing units of the SOC based on the detected temperature associated with each of the multiple computing units and a sparsity of each of the plurality of kernels;
performing convolution operations of the set of input values with the plurality of kernels using the plurality of computing units, wherein a most sparse kernel of the plurality of kernels is convolved with the set of input values on a computing unit associated with a greatest temperature of the multiple computing units; and
generating an inference based on the convolution operations.
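
The mapping step recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes sparsity is measured as the fraction of zero-valued weights in a kernel, that each computing unit reports a single temperature reading, and that kernels are dealt out round-robin when they outnumber units. The names `sparsity`, `map_kernels_to_units`, and the unit labels are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Sequence

Kernel = Sequence[Sequence[float]]


def sparsity(kernel: Kernel) -> float:
    """Fraction of zero-valued weights in a 2-D kernel (assumed metric)."""
    weights = [w for row in kernel for w in row]
    return sum(1 for w in weights if w == 0) / len(weights)


def map_kernels_to_units(
    kernels: List[Kernel], temperatures: Dict[str, float]
) -> Dict[str, List[int]]:
    """Assign kernel indices to units so the most sparse kernel
    lands on the unit with the greatest detected temperature."""
    # Units ordered hottest first.
    units = sorted(temperatures, key=temperatures.get, reverse=True)
    # Kernel indices ordered most sparse first.
    order = sorted(
        range(len(kernels)), key=lambda i: sparsity(kernels[i]), reverse=True
    )
    mapping: Dict[str, List[int]] = {unit: [] for unit in units}
    # Assumption: round-robin when there are more kernels than units.
    for rank, kernel_index in enumerate(order):
        mapping[units[rank % len(units)]].append(kernel_index)
    return mapping


# Hypothetical example: three kernels and three computing units.
kernels = [
    [[1, 2], [3, 4]],  # sparsity 0.0
    [[0, 0], [0, 1]],  # sparsity 0.75
    [[0, 1], [1, 0]],  # sparsity 0.5
]
temperatures = {"cpu": 40.0, "gpu": 70.0, "npu": 55.0}  # degrees Celsius
mapping = map_kernels_to_units(kernels, temperatures)
```

In this example the hottest unit (`gpu`) receives kernel 1, the most sparse kernel. A sparse kernel needs fewer multiply-accumulate operations when zero weights are skipped, so this pairing would plausibly add the least work to the hottest unit.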