US 12,112,397 B2
Programmable coarse grained and sparse matrix compute hardware with advanced scheduling
Eriko Nurvitadhi, Hillsboro, OR (US); Balaji Vembu, Folsom, CA (US); Nicolas C. Galoppo Von Borries, Portland, OR (US); Rajkishore Barik, Santa Clara, CA (US); Tsung-Han Lin, Campbell, CA (US); Kamal Sinha, Rancho Cordova, CA (US); Nadathur Rajagopalan Satish, Santa Clara, CA (US); Jeremy Bottleson, Rancho Cordova, CA (US); Farshad Akhbari, Chandler, AZ (US); Altug Koker, El Dorado Hills, CA (US); Narayan Srinivasa, Portland, OR (US); Dukhwan Kim, San Jose, CA (US); Sara S. Baghsorkhi, San Jose, CA (US); Justin E. Gottschlich, Santa Clara, CA (US); Feng Chen, Shanghai (CN); Elmoustapha Ould-Ahmed-Vall, Chandler, AZ (US); Kevin Nealis, San Jose, CA (US); Xiaoming Chen, Shanghai (CN); and Anbang Yao, Beijing (CN)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Jun. 14, 2023, as Appl. No. 18/334,733.
Application 18/334,733 is a continuation of application No. 17/541,413, filed on Dec. 3, 2021, granted, now 11,727,527.
Application 17/541,413 is a continuation of application No. 16/928,353, filed on Jul. 14, 2020, granted, now 11,210,760, issued on Dec. 28, 2021.
Application 16/928,353 is a continuation of application No. 16/197,783, filed on Nov. 21, 2018, granted, now 10,769,748, issued on Sep. 8, 2020.
Application 16/197,783 is a continuation of application No. 15/581,182, filed on Apr. 28, 2017, granted, now 10,186,011, issued on Jan. 22, 2019.
Prior Publication US 2023/0394616 A1, Dec. 7, 2023
Int. Cl. G06T 1/20 (2006.01); G06F 9/30 (2018.01); G06F 9/38 (2018.01); G06N 3/04 (2023.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/063 (2023.01); G06N 3/08 (2023.01); G06N 3/084 (2023.01)
CPC G06T 1/20 (2013.01) [G06F 9/3001 (2013.01); G06F 9/3017 (2013.01); G06F 9/3851 (2013.01); G06F 9/3887 (2013.01); G06F 9/3895 (2013.01); G06N 3/04 (2013.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/063 (2013.01); G06N 3/08 (2013.01); G06N 3/084 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A parallel processor comprising:
a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, wherein the multiple types of compute units include a first sparse compute unit configured for input at a first level of sparsity and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity;
a plurality of processing resources coupled with the hardware scheduler, the plurality of processing resources including the first sparse compute unit; and
hybrid memory circuitry coupled with the plurality of processing resources, the hybrid memory circuitry including a memory controller, a memory interface, and the second sparse compute unit.
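For illustration only, the following is a minimal software sketch of the dispatch idea recited in claim 1: a scheduler routing an input to one of two sparse compute units depending on the measured level of sparsity. The claim covers hardware, not software; the unit names, the sparsity metric, the 0.5 threshold, and the dispatch policy below are assumptions introduced for this sketch and are not taken from the patent.

// Hypothetical sketch; names, threshold, and policy are assumptions, not the claimed hardware.
#include <cstddef>
#include <iostream>
#include <vector>

// Fraction of zero-valued elements in a dense input buffer.
double sparsity(const std::vector<float>& data) {
    if (data.empty()) return 0.0;
    std::size_t zeros = 0;
    for (float v : data) {
        if (v == 0.0f) ++zeros;
    }
    return static_cast<double>(zeros) / static_cast<double>(data.size());
}

// Stand-ins for the two sparse compute units recited in the claim.
enum class ComputeUnit { FirstSparseUnit, SecondSparseUnit };

// Assumed policy: inputs whose sparsity exceeds the threshold go to the unit
// configured for the greater level of sparsity; all others go to the first unit.
ComputeUnit schedule(const std::vector<float>& input, double threshold = 0.5) {
    return sparsity(input) > threshold ? ComputeUnit::SecondSparseUnit
                                       : ComputeUnit::FirstSparseUnit;
}

int main() {
    std::vector<float> mostly_dense  = {1.0f, 2.0f, 0.0f, 3.0f};
    std::vector<float> mostly_sparse = {0.0f, 0.0f, 0.0f, 4.0f};

    std::cout << "dense input  -> unit "
              << (schedule(mostly_dense) == ComputeUnit::FirstSparseUnit ? 1 : 2)
              << "\nsparse input -> unit "
              << (schedule(mostly_sparse) == ComputeUnit::FirstSparseUnit ? 1 : 2)
              << "\n";
    return 0;
}

In this sketch the first input (25% zeros) is routed to the first sparse compute unit and the second input (75% zeros) to the second, mirroring the claim's distinction between a unit configured for a first level of sparsity and one configured for a greater level of sparsity.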