US 12,175,246 B2
Systems and methods for performing matrix compress and decompress instructions
Dan Baum, Haifa (IL); Michael Espig, Newberg, OR (US); James Guilford, Northborough, MA (US); Wajdi K. Feghali, Boston, MA (US); Raanan Sade, Portland, OR (US); Christopher J. Hughes, Santa Clara, CA (US); Robert Valentine, Kiryat Tivon (IL); Bret Toll, Hillsboro, OR (US); Elmoustapha Ould-Ahmed-Vall, Gilbert, AZ (US); Mark J. Charney, Lexington, MA (US); Vinodh Gopal, Westborough, MA (US); Ronen Zohar, Sunnyvale, CA (US); and Alexander F. Heinecke, San Jose, CA (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Sep. 1, 2023, as Appl. No. 18/460,497.
Application 18/460,497 is a continuation of application No. 17/672,253, filed on Feb. 15, 2022, granted, now 11,748,103, issued on Sep. 5, 2023.
Application 17/672,253 is a continuation of application No. 16/934,003, filed on Jul. 20, 2020, granted, now 11,249,761, issued on Feb. 15, 2022.
Application 16/934,003 is a continuation of application No. 16/144,902, filed on Sep. 27, 2018, granted, now 10,719,323, issued on Jul. 21, 2020.
Prior Publication US 2024/0045690 A1, Feb. 8, 2024
Int. Cl. G06F 9/30 (2018.01); G06F 9/38 (2018.01)
CPC G06F 9/30178 (2013.01) [G06F 9/30036 (2013.01); G06F 9/3013 (2013.01); G06F 9/30145 (2013.01); G06F 9/3802 (2013.01)]
18 Claims
 
1. A system comprising:
a memory to store instructions;
execution circuitry to execute the instructions to perform operations to compress a first tile of a first source matrix to generate a first compressed tile, the first source matrix comprising a plurality of integer data elements and the first tile comprising a first subset of the integer data elements, wherein the first tile is to be compressed by packing one or more non-zero-valued integer data elements of the first subset of integer data elements over zero-valued integer data elements and storing a matrix position of the one or more non-zero-valued integer data elements in an index; and
matrix multiplication circuitry to multiply the first compressed tile and a second tile of a second source matrix, the second tile comprising a second subset of integer data elements of the second source matrix, the matrix multiplication circuitry comprising:
a plurality of multiply-accumulate circuits to perform a plurality of fused multiply-add operations to multiply the second subset of integer data elements of the second source matrix by corresponding non-zero-valued integer data elements of the first source matrix identified based on the index to generate a plurality of products, and to add groups of the plurality of products to generate corresponding result data elements of a result matrix.
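
The compress-then-multiply flow recited in claim 1 can be illustrated in software. The following is a minimal sketch only, not the claimed circuitry: it assumes a tile is a small list of integer rows, that the index stores a (row, column) pair for each packed element, and that the names compress_tile and multiply_compressed are hypothetical helpers introduced here for illustration. The claim itself fixes neither the index encoding nor the tile dimensions.

```python
from typing import List, Tuple

Tile = List[List[int]]
Index = List[Tuple[int, int]]


def compress_tile(tile: Tile) -> Tuple[List[int], Index]:
    """Pack the non-zero elements of `tile` contiguously and record
    the (row, column) position of each one in an index.

    Hypothetical helper: the (row, column) encoding is an assumption.
    """
    values: List[int] = []
    index: Index = []
    for r, row in enumerate(tile):
        for c, v in enumerate(row):
            if v != 0:
                values.append(v)       # non-zero element is packed
                index.append((r, c))   # its matrix position is kept
    return values, index


def multiply_compressed(values: List[int], index: Index,
                        rows_a: int, b: Tile) -> Tile:
    """Multiply a compressed tile by a dense tile `b`.

    Only the non-zero elements identified by the index take part;
    each contributes one multiply-accumulate per column of `b`.
    """
    cols_b = len(b[0])
    result = [[0] * cols_b for _ in range(rows_a)]
    for v, (r, k) in zip(values, index):
        for j in range(cols_b):
            # Multiply-accumulate: products sharing the same (r, j)
            # are summed into one result element.
            result[r][j] += v * b[k][j]
    return result


if __name__ == "__main__":
    a = [[0, 2, 0],
         [3, 0, 0]]
    b = [[1, 2],
         [3, 4],
         [5, 6]]
    values, index = compress_tile(a)
    print(multiply_compressed(values, index, len(a), b))  # [[6, 8], [3, 6]]
```

In this sketch the zero-valued elements of the first tile never reach the multiply-accumulate step, which is the saving the index makes possible; a hardware implementation would perform the per-element products in parallel rather than in nested loops.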