CPC G06F 17/16 (2013.01) [G06F 7/4876 (2013.01); G06F 9/3001 (2013.01); G06F 9/30036 (2013.01); G06F 13/1673 (2013.01); G06F 2207/3892 (2013.01)]
20 Claims
1. An apparatus comprising:
a systolic multiplier, including:
a first set of first in first out (FIFO) buffers to store data in a first input matrix;
a second set of FIFO buffers to store data in a second input matrix;
a plurality of processing elements (PEs) to receive the matrix data from the first set of FIFO buffers and the second set of FIFO buffers and perform multiply-add operations on the first input matrix and the second input matrix;
a plurality of storage elements to locally store intermediate matrix multiplication values; and
sparse matrix acceleration circuitry to detect zero values in the matrix data and perform one or more optimizations on the matrix data to reduce the multiply-add operations to be performed by the matrix systolic multiplier, including swapping rows with other rows in a sub-matrix for each of a plurality of sub-matrices of the first input matrix.
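The zero detection and row swapping recited above can be illustrated in software. The following is a minimal sketch under stated assumptions, not the claimed circuitry: the claim does not say which rows are swapped or why, so the sketch assumes that all-zero rows are swapped to the bottom of each sub-matrix so that only the remaining rows are fed to the multiplier. The names `pack_nonzero_rows`, `sparse_matmul`, and `tile_rows` are hypothetical, and the FIFO buffers and processing elements are not modeled.

```python
def pack_nonzero_rows(tile):
    """Swap rows inside one sub-matrix so all-zero rows sink to the bottom.

    Assumed policy (not stated in the claim). Returns the reordered rows,
    a permutation mapping each new position to its original row index,
    and the number of rows that still contain non-zero data.
    """
    rows = [list(r) for r in tile]
    perm = list(range(len(rows)))
    top, bottom = 0, len(rows) - 1
    while top < bottom:
        if any(v != 0 for v in rows[top]):
            top += 1                      # row already useful, leave it
        elif all(v == 0 for v in rows[bottom]):
            bottom -= 1                   # zero row already at the bottom
        else:
            # Swap a zero row near the top with a non-zero row near the bottom.
            rows[top], rows[bottom] = rows[bottom], rows[top]
            perm[top], perm[bottom] = perm[bottom], perm[top]
            top += 1
            bottom -= 1
    live = sum(1 for r in rows if any(v != 0 for v in r))
    return rows, perm, live


def sparse_matmul(a, b, tile_rows=2):
    """Multiply a by b, skipping work for zeros found in a.

    a is split into sub-matrices of tile_rows rows each. Returns the
    product and the number of multiply-add operations actually performed.
    """
    n_cols = len(b[0])
    c = [[0] * n_cols for _ in a]
    macs = 0
    for start in range(0, len(a), tile_rows):
        tile, perm, live = pack_nonzero_rows(a[start:start + tile_rows])
        for i in range(live):             # all-zero rows are never issued
            out_row = c[start + perm[i]]  # undo the swap when storing
            for k, a_ik in enumerate(tile[i]):
                if a_ik == 0:
                    continue              # zero detected: skip multiply-add
                for j in range(n_cols):
                    out_row[j] += a_ik * b[k][j]
                    macs += 1
    return c, macs
```

Because the permutation is kept alongside each sub-matrix, results are written back to their original row positions, so the product is unchanged and only the amount of work differs from a dense multiplication.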