US 11,714,998 B2
Accelerating neural networks with low precision-based multiplication and exploiting sparsity in higher order bits
Avishaii Abuhatzera, Amir (IL); Om Ji Omer, Bangalore (IN); Ritwika Chowdhury, Bengaluru (IN); and Lance Hacking, Spanish Fork, UT (US)
Assigned to INTEL CORPORATION, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Jun. 23, 2020, as Appl. No. 16/909,295.
Claims priority of application No. 202041019060 (IN), filed on May 5, 2020.
Prior Publication US 2020/0320375 A1, Oct. 8, 2020
Int. Cl. G06N 3/063 (2023.01); G06N 3/08 (2023.01); G06N 3/04 (2023.01); G06N 3/088 (2023.01)
CPC G06N 3/063 (2013.01) [G06N 3/0454 (2013.01); G06N 3/088 (2013.01)] 25 Claims
OG exemplary drawing
 
1. An apparatus comprising:
a processor comprising:
a re-encoder to re-encode a first input number of signed input numbers represented in a first precision format as part of a machine learning model, the first input number re-encoded into two signed input numbers of a second precision format, wherein the first precision format is a higher precision format than the second precision format;
a multiply-add circuit to perform operations in the first precision format using the two signed input numbers of the second precision format; and
a sparsity hardware circuit to reduce computing on zero values at the multiply-add circuit, wherein the sparsity hardware circuit comprises a finite state machine (FSM) to determine whether any of the two signed input numbers corresponding to most significant bits (MSBs) comprise zero values and to cause the multiply-add circuit to skip the operations on numbers comprising zero values;
wherein the processor is to execute the machine learning model using the re-encoder, the multiply-add circuit, and the sparsity hardware circuit.
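
The claim does not spell out the re-encoding scheme, so the C sketch below illustrates one plausible reading of the three claimed elements: a signed 8-bit value is split into two lower-precision halves, the multiply-add is computed from two partial products in the original precision, and the high-order partial product is skipped when its half is zero (the claim's FSM is modeled as a plain zero test). All names here (reencode_int8, mac_with_msb_skip) are illustrative, not from the patent, and the signed-high/unsigned-low nibble split is an assumption rather than the patented encoding.

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative split of a signed 8-bit value x into a signed high
   nibble and an unsigned low nibble with x == hi * 16 + lo. The claim
   calls for two signed lower-precision numbers; the exact encoding is
   not given in the claim text, so this split is an assumption. */
static inline void reencode_int8(int8_t x, int8_t *hi, uint8_t *lo)
{
    *hi = (int8_t)(x >> 4);    /* arithmetic shift keeps the sign: -8..7 */
    *lo = (uint8_t)(x & 0x0F); /* low nibble: 0..15 */
}

/* Dot product in the original precision computed from the two halves:
   x * w == lo * w + (hi * w) * 16. The claimed FSM is modeled as a
   zero test: when the high nibble is zero (sparsity in the
   higher-order bits), its partial product is skipped entirely. */
static int32_t mac_with_msb_skip(const int8_t *act, const int8_t *wgt,
                                 int n, int *skipped)
{
    int32_t acc = 0;
    for (int i = 0; i < n; i++) {
        int8_t hi;
        uint8_t lo;
        reencode_int8(act[i], &hi, &lo);
        acc += (int32_t)lo * wgt[i];          /* low-order partial product */
        if (hi != 0)
            acc += (int32_t)hi * wgt[i] * 16; /* high-order partial product */
        else
            (*skipped)++;                     /* zero-valued MSBs: skip */
    }
    return acc;
}

int main(void)
{
    /* Small-magnitude activations are common in trained networks, so the
       high nibble is often zero and its multiply can be skipped. */
    int8_t act[] = { 3, -2, 90, 7, 0, -1, 5, 120 };
    int8_t wgt[] = { 1,  4, -3, 2, 9,  6, 7,  -5 };
    int skipped = 0;
    int32_t dot = mac_with_msb_skip(act, wgt, 8, &skipped);

    int32_t ref = 0;              /* reference dot product, full precision */
    for (int i = 0; i < 8; i++)
        ref += (int32_t)act[i] * wgt[i];

    printf("split MAC = %d, reference = %d, skipped = %d of 8\n",
           dot, ref, skipped);
    return 0;
}
```

One design note on the hedge above: with both halves signed 4-bit in [-8, 7], hi * 16 + lo spans only [-136, 119] and cannot represent 120..127, so a naive signed/signed nibble split is not value-preserving over the full int8 range; the patented re-encoding presumably handles this differently. The signed-high/unsigned-low split used here sidesteps that issue while keeping the skip condition on the signed high-order half, which is the part of the technique the claim emphasizes.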