US 12,141,683 B2
Performance scaling for dataflow deep neural network hardware accelerators
Arnab Raha, Santa Clara, CA (US); Debabrata Mohapatra, Santa Clara, CA (US); Gautham Chinya, Sunnyvale, CA (US); Guruguhanathan Venkataramanan, Livermore, CA (US); Sang Kyun Kim, San Jose, CA (US); Deepak Mathaikutty, Santa Clara, CA (US); Raymond Sung, San Francisco, CA (US); and Cormac Brick, San Francisco, CA (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Apr. 30, 2021, as Appl. No. 17/246,341.
Prior Publication US 2021/0271960 A1, Sep. 2, 2021
Int. Cl. G06F 17/10 (2006.01); G06F 9/30 (2018.01); G06N 3/04 (2023.01); G06N 3/063 (2023.01)
CPC G06N 3/063 (2013.01) [G06F 9/3001 (2013.01); G06N 3/04 (2013.01)] 20 Claims
OG exemplary drawing
 
19. A method, comprising:
receiving sparsity information for input activations (IFs) and weights (FLs) to be used for executing a neural network (NN);
determining an average combined sparsity value for the IFs and FLs based on the sparsity information; and
activating or deactivating, based on the average combined sparsity value, a plurality of multiply-and-accumulate (MAC) units in a processing element (PE) of a PE array implemented by hardware, wherein the PE includes a plurality of MAC units and a register file (RF) comprising a plurality of RF instances.
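The steps of claim 19 can be sketched as follows. This is a minimal illustrative interpretation, not the patented implementation: the definition of "combined sparsity" (an operand pair is skippable when either the activation or the weight is zero) and the proportional MAC-gating policy are assumptions introduced here for clarity, and the function names are hypothetical.

```python
import math

def combined_sparsity(if_vals, fl_vals):
    """Fraction of multiply operand pairs that are skippable because
    either the input activation (IF) or the weight (FL) is zero.
    (Assumed definition of 'combined sparsity' for illustration.)"""
    pairs = list(zip(if_vals, fl_vals))
    skippable = sum(1 for a, w in pairs if a == 0 or w == 0)
    return skippable / len(pairs)

def active_mac_count(avg_sparsity, total_macs):
    """Number of MAC units to keep active in a PE, deactivating the
    rest in proportion to the average combined sparsity.
    (Assumed gating policy; at least one MAC stays active.)"""
    active = math.ceil((1.0 - avg_sparsity) * total_macs)
    return max(1, min(total_macs, active))

# Example: a PE with 8 MAC units, one tile of IFs and FLs.
ifs = [1, 0, 2, 0]
fls = [0, 3, 4, 5]
s = combined_sparsity(ifs, fls)   # 3 of 4 pairs skippable -> 0.75
n = active_mac_count(s, 8)        # keep 2 of 8 MACs active
```

Under this reading, higher combined sparsity lets the hardware power-gate more MAC units while preserving throughput, which is the performance-scaling effect the claim targets.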