US 11,900,114 B2
Systems and methods to skip inconsequential matrix operations
Elmoustapha Ould-Ahmed-Vall, Chandler, AZ (US); William Rash, Saratoga, CA (US); Subramaniam Maiyuran, Gold River, CA (US); Varghese George, Folsom, CA (US); and Rajesh Sankaran, Portland, OR (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Aug. 1, 2022, as Appl. No. 17/878,427.
Application 17/878,427 is a continuation of application No. 16/453,724, filed on Jun. 26, 2019, granted, now 11,403,097, issued on Aug. 2, 2022.
Prior Publication US 2023/0070579 A1, Mar. 9, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 9/30 (2018.01)
CPC G06F 9/30069 (2013.01) [G06F 9/3001 (2013.01); G06F 9/30021 (2013.01); G06F 9/30083 (2013.01); G06F 9/30101 (2013.01)]
25 Claims
 
1. A processor comprising:
a matrix operations accelerator comprising a two-dimensional grid of circuits;
decode circuitry to decode a single instruction having fields to specify an opcode and locations of a first source matrix, a second source matrix, and a destination matrix that is a single two-dimensional tile register in the matrix operations accelerator, the opcode indicating that execution circuitry is to cause the matrix operations accelerator to detect close to but non zero values of corresponding elements of the first source matrix and the second source matrix that would generate inconsequential results when operated on, skip operations that would generate inconsequential results based on the detected close to but non zero values of the corresponding elements by disabling a corresponding circuit of the two-dimensional grid of circuits, operate on other elements of the first source matrix with a corresponding other element of the second source matrix by a corresponding circuit of the two-dimensional grid of circuits to generate a resultant, and store the resultant in a corresponding element in the single two-dimensional tile register; and
the execution circuitry to execute the single instruction as per the opcode.
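
The behaviour recited in claim 1 can be illustrated with a minimal software model. This is only a sketch and not the claimed hardware. It assumes that the operation is a multiply-accumulate (the claim does not name the operation), that "close to but non zero" means a magnitude strictly between zero and a hypothetical threshold EPSILON, and that a skipped step leaves the destination element unchanged. The names EPSILON, is_inconsequential, and skip_matmul are illustrative and do not appear in the patent.

```python
# Illustrative software model only; the claim describes hardware behaviour.
EPSILON = 1e-6  # hypothetical threshold for "close to but non zero"


def is_inconsequential(a, b, eps=EPSILON):
    """Return True when either operand is close to, but not equal to, zero.

    Assumption: such an operand makes the product too small to matter.
    """
    return 0 < abs(a) < eps or 0 < abs(b) < eps


def skip_matmul(src1, src2, dst, eps=EPSILON):
    """Accumulate src1 x src2 into dst, skipping inconsequential products.

    dst plays the role of the single two-dimensional tile register.
    Returns the number of multiply-accumulate steps that were skipped.
    """
    rows, inner, cols = len(src1), len(src2), len(src2[0])
    skipped = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):
                a, b = src1[i][k], src2[k][j]
                if is_inconsequential(a, b, eps):
                    # Models disabling the grid circuit for this step;
                    # dst[i][j] is left unchanged (an assumption).
                    skipped += 1
                    continue
                dst[i][j] += a * b
    return skipped


if __name__ == "__main__":
    first = [[1.0, 1e-9], [2.0, 3.0]]
    second = [[4.0, 5.0], [6.0, 7.0]]
    tile = [[0.0, 0.0], [0.0, 0.0]]
    count = skip_matmul(first, second, tile)
    print(tile, count)
```

In this model the element 1e-9 in the first source matrix falls below the assumed threshold, so the two products that would use it are never computed, while every other product is accumulated as usual.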