US 11,676,239 B2
Sparse optimizations for a matrix accelerator architecture
Joydeep Ray, Folsom, CA (US); Scott Janus, Loomis, CA (US); Varghese George, Folsom, CA (US); Subramaniam Maiyuran, Gold River, CA (US); Altug Koker, El Dorado Hills, CA (US); Abhishek Appu, El Dorado Hills, CA (US); Prasoonkumar Surti, Folsom, CA (US); Vasanth Ranganathan, El Dorado Hills, CA (US); Andrei Valentin, San Jose, CA (US); Ashutosh Garg, Folsom, CA (US); Yoav Harel, Carmichael, CA (US); Arthur Hunter, Jr., Cameron Park, CA (US); SungYe Kim, Folsom, CA (US); Mike Macpherson, Portland, OR (US); Elmoustapha Ould-Ahmed-Vall, Chandler, AZ (US); William Sadler, Folsom, CA (US); Lakshminarayanan Striramassarma, Folsom, CA (US); and Vikranth Vemulapalli, Folsom, CA (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Jun. 3, 2021, as Appl. No. 17/303,654.
Application 17/303,654 is a continuation of application No. 17/064,427, filed on Oct. 6, 2020, granted, now Pat. No. 11,113,784.
Application 17/064,427 is a continuation of application No. PCT/US2020/022846, filed on Mar. 14, 2020.
Claims priority of provisional application 62/935,670, filed on Nov. 15, 2019.
Claims priority of provisional application 62/819,337, filed on Mar. 15, 2019.
Claims priority of provisional application 62/819,435, filed on Mar. 15, 2019.
Claims priority of provisional application 62/819,361, filed on Mar. 15, 2019.
Prior Publication US 2021/0374897 A1, Dec. 2, 2021
Int. Cl. G06T 1/20 (2006.01); G06F 9/50 (2006.01); G06F 12/0806 (2016.01); G06F 15/80 (2006.01); G06F 17/16 (2006.01); G06F 7/544 (2006.01); G06N 3/04 (2023.01); G06N 3/08 (2023.01); G06N 3/084 (2023.01); G06N 3/048 (2023.01)

CPC G06T 1/20 (2013.01) [G06F 7/5443 (2013.01); G06F 9/5027 (2013.01); G06F 12/0806 (2013.01); G06F 15/8046 (2013.01); G06F 17/16 (2013.01); G06N 3/048 (2023.01); G06N 3/08 (2013.01); G06N 3/084 (2013.01)]

20 Claims

1. A general purpose graphics processor comprising:

a processing resource including a matrix accelerator and a decoder, the matrix accelerator including a load filter to bypass a load of a sparse submatrix of an input matrix and the decoder to decode an encoded set of data associated with the input matrix to generate a decoded set of data, the decoder to decode the encoded set of data based on metadata associated with the encoded set of data, wherein the load filter is to bypass the load of the sparse submatrix based on the metadata associated with the encoded set of data,

the load filter configured to bypass a load of a near-sparse submatrix having a limited number of non-zero values; and

the matrix accelerator configured to send a message to indicate bypass of the near-sparse submatrix.
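
The load-filter behavior recited in claim 1 can be illustrated with a minimal software sketch. It is not the claimed hardware: the tile size, the per-tile non-zero count used as metadata, the `near_sparse_limit` threshold, and the names `TileMeta`, `build_metadata`, and `load_filter` are all assumptions introduced for illustration. The sketch only shows the decision the claim describes: skip the load of an all-zero submatrix, skip the load of a submatrix with a limited number of non-zero values, and emit a message for each near-sparse bypass.

```python
from dataclasses import dataclass


@dataclass
class TileMeta:
    """Hypothetical per-submatrix metadata: tile origin and non-zero count."""
    row: int
    col: int
    nonzeros: int


def build_metadata(matrix, tile):
    """Count the non-zero values in each tile x tile submatrix of `matrix`."""
    rows, cols = len(matrix), len(matrix[0])
    metas = []
    for r in range(0, rows, tile):
        for c in range(0, cols, tile):
            nz = sum(
                1
                for i in range(r, min(r + tile, rows))
                for j in range(c, min(c + tile, cols))
                if matrix[i][j] != 0
            )
            metas.append(TileMeta(r, c, nz))
    return metas


def load_filter(metas, near_sparse_limit):
    """Decide, from metadata alone, which submatrices to load.

    Returns (loaded, messages):
      loaded   - origins of submatrices that would be loaded
      messages - one (row, col, nonzeros) entry per bypassed near-sparse
                 submatrix, standing in for the bypass message in the claim
    """
    loaded, messages = [], []
    for m in metas:
        if m.nonzeros == 0:
            # Sparse submatrix: bypass the load entirely.
            continue
        if m.nonzeros <= near_sparse_limit:
            # Near-sparse submatrix: bypass the load and report it.
            messages.append((m.row, m.col, m.nonzeros))
            continue
        loaded.append((m.row, m.col))
    return loaded, messages
```

Calling `load_filter(build_metadata(matrix, 2), near_sparse_limit=1)` on a 4x4 matrix would classify each 2x2 submatrix as loaded, bypassed silently, or bypassed with a message. How a bypassed near-sparse submatrix is handled after the message is sent is outside this sketch.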