US 12,293,431 B2
Sparse optimizations for a matrix accelerator architecture
Joydeep Ray, Folsom, CA (US); Scott Janus, Loomis, CA (US); Varghese George, Folsom, CA (US); Subramaniam Maiyuran, Gold River, CA (US); Altug Koker, El Dorado Hills, CA (US); Abhishek Appu, El Dorado Hills, CA (US); Prasoonkumar Surti, Folsom, CA (US); Vasanth Ranganathan, El Dorado Hills, CA (US); Valentin Andrei, San Jose, CA (US); Ashutosh Garg, Folsom, CA (US); Yoav Harel, Carmichael, CA (US); Arthur Hunter, Jr., Cameron Park, CA (US); SungYe Kim, Folsom, CA (US); Mike Macpherson, Portland, OR (US); Elmoustapha Ould-Ahmed-Vall, Chandler, AZ (US); William Sadler, Folsom, CA (US); Lakshminarayanan Striramassarma, Folsom, CA (US); and Vikranth Vemulapalli, Folsom, CA (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on May 2, 2023, as Appl. No. 18/310,688.
Application 18/310,688 is a continuation of application No. 17/303,654, filed on Jun. 3, 2021, granted, now 11,676,239.
Application 17/303,654 is a continuation of application No. 17/064,427, filed on Oct. 6, 2020, granted, now 11,113,784, issued on Sep. 7, 2021.
Application 17/064,427 is a continuation of application No. PCT/US2020/022846, filed on Mar. 14, 2020.
Claims priority of provisional application 62/935,670, filed on Nov. 15, 2019.
Claims priority of provisional application 62/819,337, filed on Mar. 15, 2019.
Claims priority of provisional application 62/819,435, filed on Mar. 15, 2019.
Claims priority of provisional application 62/819,361, filed on Mar. 15, 2019.
Prior Publication US 2023/0351543 A1, Nov. 2, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06T 1/20 (2006.01); G06F 7/544 (2006.01); G06F 9/30 (2018.01); G06F 9/38 (2018.01); G06F 9/50 (2006.01); G06F 12/0806 (2016.01); G06F 15/80 (2006.01); G06F 17/16 (2006.01); G06N 3/048 (2023.01); G06N 3/08 (2023.01); G06N 3/084 (2023.01)
CPC G06T 1/20 (2013.01) [G06F 7/5443 (2013.01); G06F 9/30036 (2013.01); G06F 9/3887 (2013.01); G06F 9/3888 (2023.08); G06F 9/38885 (2023.08); G06F 9/5027 (2013.01); G06F 12/0806 (2013.01); G06F 15/8046 (2013.01); G06F 17/16 (2013.01); G06N 3/048 (2023.01); G06N 3/08 (2013.01); G06N 3/084 (2013.01)]
20 Claims
 
1. A graphics processor comprising:
a processing cluster including a plurality of processing resources that are communicatively coupled via a data interconnect, a processing resource of the plurality of processing resources including:
first circuitry to execute instructions; and
second circuitry to detect a zero value data element within output generated by the first circuitry, the output including a plurality of data elements, generate metadata to indicate a location of the zero value data element within the plurality of data elements, and compress the plurality of data elements based at least in part on the metadata, the metadata including a bitfield having a plurality of bits that correspond with the plurality of data elements.
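The zero detection, metadata generation, and compression recited for the second circuitry can be illustrated in software. The following is a minimal sketch only, not the claimed hardware or the patentee's implementation. It assumes a bitfield with one bit per data element in which a set bit marks a zero value element (the claim fixes neither the polarity nor the bit order), and the function names `compress_zeros` and `decompress_zeros` are hypothetical.

```python
from typing import List, Tuple


def compress_zeros(elements: List[int]) -> Tuple[int, List[int]]:
    """Detect zero elements, record their locations, and drop them.

    Returns (bitfield, packed). Assumption for this sketch: bit i of
    the bitfield is set when elements[i] is zero. `packed` holds the
    nonzero elements in their original order.
    """
    bitfield = 0
    packed = []
    for i, value in enumerate(elements):
        if value == 0:
            bitfield |= 1 << i      # metadata: location of a zero element
        else:
            packed.append(value)    # only nonzero data is kept
    return bitfield, packed


def decompress_zeros(bitfield: int, packed: List[int], count: int) -> List[int]:
    """Rebuild the original `count` elements from bitfield and packed data."""
    nonzero = iter(packed)
    restored = []
    for i in range(count):
        if (bitfield >> i) & 1:
            restored.append(0)              # bit set: element was zero
        else:
            restored.append(next(nonzero))  # bit clear: take next stored value
    return restored
```

Under these assumptions, the eight-element output `[3, 0, 0, 7, 0, 1, 0, 0]` compresses to the bitfield `0b11010110` and the three stored values `[3, 7, 1]`. Passing both, together with the element count 8, to `decompress_zeros` restores the original sequence.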