US 12,008,067 B2
Sparse matrix multiplication acceleration mechanism
Subramaniam Maiyuran, Gold River, CA (US); Mathew Nevin, Fair Oaks, CA (US); Jorge Parra, El Dorado Hills, CA (US); Ashutosh Garg, Folsom, CA (US); Shubra Marwaha, Santa Clara, CA (US); and Shubh Shah, Folsom, CA (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Nov. 16, 2021, as Appl. No. 17/527,324.
Application 17/527,324 is a continuation of application No. 16/561,715, filed on Sep. 5, 2019, granted, now 11,188,618.
Prior Publication US 2022/0171827 A1, Jun. 2, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 17/16 (2006.01); G06F 7/487 (2006.01); G06F 9/30 (2018.01); G06F 13/16 (2006.01)
CPC G06F 17/16 (2013.01) [G06F 7/4876 (2013.01); G06F 9/3001 (2013.01); G06F 9/30036 (2013.01); G06F 13/1673 (2013.01); G06F 2207/3892 (2013.01)]
20 Claims
1. An apparatus comprising:
a systolic multiplier, including:
a first set of first in first out (FIFO) buffers to store data in a first input matrix;
a second set of FIFO buffers to store data in a second input matrix;
a plurality of processing elements (PEs) to receive the matrix data from the first set of FIFO buffers and the second set of FIFO buffers and perform multiply-add operations on the first input matrix and the second input matrix;
a plurality of storage elements to locally store intermediate matrix multiplication values; and
sparse matrix acceleration circuitry to detect zero values in the matrix data and perform one or more optimizations on the matrix data to reduce the multiply-add operations to be performed by the matrix systolic multiplier, including swapping rows with other rows in a sub-matrix for each of a plurality of sub-matrices of the first input matrix.
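
The zero-detection and row-swapping element of claim 1 can be illustrated in software. The following is a minimal sketch only, not the claimed circuitry: the claim does not state how rows are chosen for swapping, so this sketch assumes that, within each sub-matrix of the first input matrix, all-zero rows are swapped toward the bottom so that only the leading non-zero rows are multiplied. The function names (`is_zero_row`, `pack_nonzero_rows`, `sparse_matmul`), the `block_rows` parameter, and the multiply-add counter are hypothetical and do not appear in the patent.

```python
def is_zero_row(row):
    """Zero detection: True when every element of the row is zero."""
    return all(v == 0 for v in row)


def pack_nonzero_rows(block):
    """Swap rows inside one sub-matrix so non-zero rows come first.

    Returns (reordered rows, original index of each reordered row,
    number of non-zero rows). The swap policy is an assumption.
    """
    rows = [list(r) for r in block]
    order = list(range(len(rows)))
    top, bottom = 0, len(rows) - 1
    while top < bottom:
        if not is_zero_row(rows[top]):
            top += 1
        elif is_zero_row(rows[bottom]):
            bottom -= 1
        else:
            # Zero row near the top, non-zero row near the bottom: swap them.
            rows[top], rows[bottom] = rows[bottom], rows[top]
            order[top], order[bottom] = order[bottom], order[top]
            top += 1
            bottom -= 1
    live = sum(1 for r in rows if not is_zero_row(r))
    return rows, order, live


def sparse_matmul(a, b, block_rows=4):
    """Multiply a by b, skipping zero rows and zero elements of a.

    Returns (product, number of multiply-add operations performed).
    """
    n_cols = len(b[0])
    out = [[0] * n_cols for _ in a]
    macs = 0
    for start in range(0, len(a), block_rows):
        rows, order, live = pack_nonzero_rows(a[start:start + block_rows])
        for i in range(live):  # rows past `live` are all zero: skipped
            dest = out[start + order[i]]  # undo the swap on output
            for k, a_val in enumerate(rows[i]):
                if a_val == 0:
                    continue  # element-level zero skip
                for j in range(n_cols):
                    dest[j] += a_val * b[k][j]
                    macs += 1
    return out, macs
```

Because the sketch records the original index of every swapped row, results are written back to their original positions and the product is unchanged; only the number of multiply-add operations drops relative to a dense multiplication.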