US 12,229,867 B2
Graphics architecture including a neural network pipeline
Hugues Labbe, Granite Bay, CA (US); Darrel Palke, Portland, OR (US); Sherine Abdelhak, Beaverton, OR (US); Jill Boyce, Portland, OR (US); Varghese George, Folsom, CA (US); Scott Janus, Loomis, CA (US); Adam Lake, Portland, OR (US); Zhijun Lei, Hillsboro, OR (US); Zhengmin Li, Hillsboro, OR (US); Mike MacPherson, Portland, OR (US); Carl Marshall, Portland, OR (US); Selvakumar Panneer, Portland, OR (US); Prasoonkumar Surti, Folsom, CA (US); Karthik Veeramani, Hillsboro, OR (US); Deepak Vembar, Portland, OR (US); and Vallabhajosyula Srinivasa Somayazulu, Portland, OR (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on May 1, 2023, as Appl. No. 18/310,015.
Application 18/310,015 is a continuation of application No. 17/500,631, filed on Oct. 13, 2021, granted, now 11,676,322.
Application 17/500,631 is a continuation of application No. 16/537,140, filed on Aug. 9, 2019, granted, now 11,151,769, issued on Oct. 19, 2021.
Claims priority of provisional application 62/717,603, filed on Aug. 10, 2018.
Claims priority of provisional application 62/717,685, filed on Aug. 10, 2018.
Claims priority of provisional application 62/717,593, filed on Aug. 10, 2018.
Prior Publication US 2023/0360307 A1, Nov. 9, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06T 17/20 (2006.01); G06N 3/08 (2023.01); G06T 1/20 (2006.01); G06T 1/60 (2006.01); G06T 15/00 (2011.01); G06T 15/40 (2011.01)
CPC G06T 15/005 (2013.01) [G06N 3/08 (2013.01); G06T 1/20 (2013.01); G06T 1/60 (2013.01); G06T 15/40 (2013.01); G06T 17/20 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A graphics processor comprising:
a block of execution resources;
a cache memory;
a cache memory prefetcher having an adjustable prefetch pattern that is adjustable to a learned prefetch pattern, the learned prefetch pattern learned by a neural network; and
circuitry including a programmable neural network unit, the programmable neural network unit comprising a neural network hardware block including circuitry to perform neural network operations and activation operations for a layer of the neural network, the programmable neural network unit addressable by cores within the block of execution resources and the neural network hardware block configured to perform operations associated with a neural network configured to determine a prefetch pattern for the cache memory prefetcher, wherein the prefetch pattern determined for the cache memory prefetcher is the learned prefetch pattern, the cache memory prefetcher is to prefetch data according to the learned prefetch pattern, and the learned prefetch pattern is based at least in part on a memory access pattern associated with a workload executed via the block of execution resources, wherein the neural network hardware block is configured, via the neural network, to:
recognize the workload executed via the block of execution resources as the workload associated with the learned prefetch pattern based at least in part on the memory access pattern associated with the workload; and
configure the prefetch pattern for the cache memory prefetcher to the learned prefetch pattern for use with the workload, the learned prefetch pattern one of a plurality of learned prefetch patterns associated with a plurality of workloads.
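The claimed behavior can be illustrated in software: a prefetcher that learns a pattern from a workload's memory-access trace, later recognizes that workload by its access signature, and configures itself to the matching learned pattern. The sketch below is not from the patent; it substitutes a simple stride-signature table for the claimed neural network hardware block, and all names (`LearnedPrefetcher`, `train`, `prefetch`) are illustrative assumptions.

```python
from collections import Counter

class LearnedPrefetcher:
    """Toy model of a prefetcher with learned, per-workload patterns.

    A stride-signature table stands in for the patent's neural network:
    training associates a workload's access signature with a prefetch
    pattern, and prefetch() 'recognizes' the workload to select it.
    """

    def __init__(self):
        # workload signature (dominant stride) -> learned prefetch offsets
        self.learned_patterns = {}

    def signature(self, addresses):
        """Summarize a memory-access trace by its dominant stride."""
        deltas = [b - a for a, b in zip(addresses, addresses[1:])]
        return Counter(deltas).most_common(1)[0][0]

    def train(self, addresses, depth=4):
        """Learn a pattern: prefetch the next `depth` strided addresses."""
        stride = self.signature(addresses)
        self.learned_patterns[stride] = [stride * i for i in range(1, depth + 1)]

    def prefetch(self, addresses):
        """Recognize the workload from its trace and emit prefetch addresses."""
        pattern = self.learned_patterns.get(self.signature(addresses))
        if pattern is None:
            return []  # unrecognized workload: no learned pattern to apply
        last = addresses[-1]
        return [last + offset for offset in pattern]
```

For example, after training on a trace with a 64-byte stride (`pf.train([0, 64, 128, 192])`), a later trace with the same stride is recognized and `pf.prefetch([256, 320, 384])` returns `[448, 512, 576, 640]`, while an unrecognized stride yields no prefetches. In the claim, this recognition and pattern selection is performed by the neural network hardware block rather than a lookup table.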