| CPC G06F 9/3887 (2013.01) [G06F 9/30109 (2013.01); G06F 9/30141 (2013.01); G06F 9/3824 (2013.01); G06F 17/16 (2013.01)] | 24 Claims |

|
1. A method of performing matrix multiplication of a first matrix and a second matrix using a computer system including N single instruction multiple data (SIMD) engines and N corresponding output register sets, wherein N is a number equal to or greater than two, and wherein each of the output register sets includes a corresponding plurality of output registers, the method comprising:
identifying a plurality of non-zero entries included in the first matrix, wherein each of the identified plurality of non-zero entries has a corresponding column address and a corresponding row address within the first matrix;
for each of the identified plurality of non-zero entries, using the corresponding row address of the non-zero entry to identify a corresponding one of the N SIMD engines and a corresponding one of the N output register sets to process the non-zero entry;
sorting the identified plurality of non-zero entries to select N non-zero entries, each having a different identified corresponding one of the N output register sets;
routing each of the selected N non-zero entries to the identified corresponding one of the N SIMD engines to perform multiplication operations with entries of the second matrix, wherein each of the N SIMD engines generates a plurality of products;
for each of the selected N non-zero entries, using the corresponding row address of the non-zero entry to identify one of the plurality of output registers within the identified corresponding one of the N output register sets; and
for each of the selected N non-zero entries, using the identified corresponding one of the N SIMD engines to perform accumulate operations by accessing the identified one of the output registers within the identified corresponding one of the N output register sets.
|
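The steps recited in claim 1 can be illustrated with a minimal software sketch. This is not the patented hardware, only one possible reading under stated assumptions: the function name `sparse_matmul` is hypothetical, the engine and output register set are chosen as `row % N`, the output register within that set is chosen as `row // N`, and each "SIMD engine" is emulated by a Python list comprehension that multiplies one non-zero entry by a full row of the second matrix. The claim does not specify any of these mappings. When fewer than N engines have pending entries, the sketch simply selects fewer than N entries in that pass.

```python
from collections import deque


def sparse_matmul(a, b, n_engines):
    """Multiply sparse matrix `a` by dense matrix `b` (lists of lists).

    Returns (result, passes), where `passes` counts the selection rounds.
    All mappings below are illustrative assumptions, not taken from the claim.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    regs_per_set = -(-rows // n_engines)  # ceiling division

    # One output register set per engine; each register holds one output row.
    reg_sets = [
        [[0] * cols for _ in range(regs_per_set)] for _ in range(n_engines)
    ]

    # Identify non-zero entries and use the row address to pick the engine
    # and output register set (assumed mapping: row % N).
    queues = [deque() for _ in range(n_engines)]
    for r in range(rows):
        for c in range(inner):
            if a[r][c] != 0:
                queues[r % n_engines].append((r, c, a[r][c]))

    passes = 0
    while any(queues):
        # "Sorting" step: select at most one pending entry per engine, so
        # every selected entry targets a different output register set.
        batch = [q.popleft() for q in queues if q]
        for r, c, value in batch:
            engine = r % n_engines    # routing to the identified engine
            register = r // n_engines  # assumed register choice within the set
            # Emulated SIMD multiply: one entry times a row of `b`.
            products = [value * x for x in b[c]]
            # Accumulate into the identified output register.
            acc = reg_sets[engine][register]
            for j, p in enumerate(products):
                acc[j] += p
        passes += 1

    # Read the output rows back out in row order.
    result = [reg_sets[r % n_engines][r // n_engines] for r in range(rows)]
    return result, passes
```

For example, `sparse_matmul([[1, 0, 2], [0, 0, 3], [4, 5, 0]], [[1, 2], [3, 4], [5, 6]], 2)` places rows 0 and 2 on engine 0 and row 1 on engine 1. Because each selected entry writes to a different output register set, the accumulations within one pass never target the same register.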