US 11,922,535 B2
Compute optimization mechanism for deep neural networks
Prasoonkumar Surti, Folsom, CA (US); Narayan Srinivasa, Portland, OR (US); Feng Chen, Shanghai (CN); Joydeep Ray, Folsom, CA (US); Ben J. Ashbaugh, Folsom, CA (US); Nicolas C. Galoppo Von Borries, Portland, OR (US); Eriko Nurvitadhi, Hillsboro, OR (US); Balaji Vembu, Folsom, CA (US); Tsung-Han Lin, Campbell, CA (US); Kamal Sinha, Rancho Cordova, CA (US); Rajkishore Barik, Santa Clara, CA (US); Sara S. Baghsorkhi, San Jose, CA (US); Justin E. Gottschlich, Santa Clara, CA (US); Altug Koker, El Dorado Hills, CA (US); Nadathur Rajagopalan Satish, Santa Clara, CA (US); Farshad Akhbari, Chandler, AZ (US); Dukhwan Kim, San Jose, CA (US); Wenyin Fu, Folsom, CA (US); Travis T. Schluessler, Hillsboro, OR (US); Josh B. Mastronarde, Sacramento, CA (US); Linda L. Hurd, Cool, CA (US); John H. Feit, Folsom, CA (US); Jeffery S. Boles, Folsom, CA (US); Adam T. Lake, Portland, OR (US); Karthik Vaidyanathan, Berkeley, CA (US); Devan Burke, Portland, OR (US); Subramaniam Maiyuran, Gold River, CA (US); and Abhishek R. Appu, El Dorado Hills, CA (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Feb. 13, 2023, as Appl. No. 18/168,207.
Application 18/168,207 is a continuation of application No. 17/741,934, filed on May 11, 2022, granted, now 11,593,910.
Application 17/741,934 is a continuation of application No. 17/385,693, filed on Jul. 26, 2021, granted, now 11,334,962, issued on May 17, 2022.
Application 17/385,693 is a continuation of application No. 17/145,885, filed on Jan. 11, 2021, granted, now 11,348,198, issued on May 31, 2022.
Application 17/145,885 is a continuation of application No. 15/819,093, filed on Nov. 21, 2017, granted, now 10,902,547, issued on Jan. 26, 2021.
Application 15/819,093 is a continuation of application No. 15/494,886, filed on Apr. 24, 2017, granted, now 10,417,731, issued on Sep. 17, 2019.
Prior Publication US 2023/0260072 A1, Aug. 17, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06T 1/20 (2006.01); G06F 9/455 (2018.01); G06F 9/50 (2006.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/063 (2023.01); G06N 3/084 (2023.01); G06F 8/41 (2018.01)

CPC G06T 1/20 (2013.01) [G06F 9/45533 (2013.01); G06F 9/5061 (2013.01); G06F 9/5094 (2013.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/063 (2013.01); G06N 3/084 (2013.01); G06F 8/41 (2013.01); G06F 2009/45583 (2013.01)]

17 Claims

1. A graphics processing unit comprising one or more multiprocessors, at least one of the one or more multiprocessors including:

a register file to store a plurality of different types of operands; and

a plurality of processing cores, including:

a first set of processing cores of a first type to perform multi-dimensional matrix operations on a first set of operands in a first set of registers of the register file, wherein the first set of processing cores of the first type includes circuitry to execute instructions to perform matrix operations on the first set of operands in the first set of registers of the register file and the first set of processing cores of the first type are associated with a first memory channel of a memory device coupled with the at least one of the one or more multiprocessors; and

a second set of processing cores of a second type, the second set of processing cores being different from the first set of processing cores, the second set of processing cores to perform general purpose graphics processing unit (GPGPU) operations on a second set of operands in a second set of registers of the register file, wherein the second set of processing cores of the second type are associated with a second memory channel of the memory device coupled with the at least one of the one or more multiprocessors, the second memory channel is distinct from the first memory channel, and the memory device is external to the at least one of the one or more multiprocessors.
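
The arrangement recited in claim 1 can be pictured with a small illustrative model. This is only a sketch to help read the claim, not the patented implementation and not a description of any real hardware interface. All names here (CoreType, CoreSet, Multiprocessor, add_core_set, channel_for), the register ranges, and the integer channel indices are hypothetical and chosen for illustration. The model captures three recited features: one register file shared by the multiprocessor, two sets of cores of different types that each operate on their own set of registers, and a distinct memory channel of an external memory device for each core set.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class CoreType(Enum):
    MATRIX = "matrix"  # "first type": multi-dimensional matrix operations
    GPGPU = "gpgpu"    # "second type": general purpose GPU operations


@dataclass(frozen=True)
class CoreSet:
    core_type: CoreType
    registers: range     # hypothetical: this set's registers within the register file
    memory_channel: int  # hypothetical: channel index on the external memory device


@dataclass
class Multiprocessor:
    register_file_size: int
    core_sets: List[CoreSet] = field(default_factory=list)

    def add_core_set(self, core_set: CoreSet) -> None:
        # The set of registers must lie inside the multiprocessor's register file.
        regs = core_set.registers
        if regs.start < 0 or regs.stop > self.register_file_size:
            raise ValueError("registers fall outside the register file")
        # The claim requires the second memory channel to be distinct from the first.
        for existing in self.core_sets:
            if existing.memory_channel == core_set.memory_channel:
                raise ValueError("memory channel already used by another core set")
        self.core_sets.append(core_set)

    def channel_for(self, core_type: CoreType) -> int:
        # Look up which memory channel a given type of core is associated with.
        for core_set in self.core_sets:
            if core_set.core_type is core_type:
                return core_set.memory_channel
        raise KeyError(core_type)


# Example configuration: sizes and channel numbers are arbitrary assumptions.
mp = Multiprocessor(register_file_size=128)
mp.add_core_set(CoreSet(CoreType.MATRIX, range(0, 64), memory_channel=0))
mp.add_core_set(CoreSet(CoreType.GPGPU, range(64, 128), memory_channel=1))
```

In this sketch the distinct-channel limitation is enforced when a core set is added, so a configuration that places both core types on the same channel is rejected rather than silently accepted. The claim does not state that the two sets of registers are disjoint, so the model does not enforce that.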