US 12,066,975 B2
Cache structure and utilization
Altug Koker, El Dorado Hills, CA (US); Lakshminarayanan Striramassarma, Folsom, CA (US); Aravindh Anantaraman, Folsom, CA (US); Valentin Andrei, San Jose, CA (US); Abhishek R. Appu, El Dorado Hills, CA (US); Sean Coleman, Folsom, CA (US); Varghese George, Folsom, CA (US); K Pattabhiraman, Bangalore KA (IN); Mike MacPherson, Portland, OR (US); Subramaniam Maiyuran, Gold River, CA (US); ElMoustapha Ould-Ahmed-Vall, Chandler, AZ (US); Vasanth Ranganathan, El Dorado Hills, CA (US); Joydeep Ray, Folsom, CA (US); S Jayakrishna P, Bangalore KA (IN); and Prasoonkumar Surti, Folsom, CA (US)
Assigned to INTEL CORPORATION, Santa Clara, CA (US)
Appl. No. 17/429,291
Filed by Intel Corporation, Santa Clara, CA (US)
PCT Filed Mar. 14, 2020, PCT No. PCT/US2020/022849
§ 371(c)(1), (2) Date Aug. 6, 2021,
PCT Pub. No. WO2020/190811, PCT Pub. Date Sep. 24, 2020.
Claims priority of provisional application 62/819,337, filed on Mar. 15, 2019.
Claims priority of provisional application 62/819,435, filed on Mar. 15, 2019.
Claims priority of provisional application 62/819,361, filed on Mar. 15, 2019.
Prior Publication US 2022/0138104 A1, May 5, 2022
Int. Cl. G06F 12/00 (2006.01); G06F 7/544 (2006.01); G06F 7/575 (2006.01); G06F 7/58 (2006.01); G06F 9/30 (2018.01); G06F 9/38 (2018.01); G06F 9/50 (2006.01); G06F 12/02 (2006.01); G06F 12/06 (2006.01); G06F 12/0802 (2016.01); G06F 12/0804 (2016.01); G06F 12/0811 (2016.01); G06F 12/0862 (2016.01); G06F 12/0866 (2016.01); G06F 12/0871 (2016.01); G06F 12/0875 (2016.01); G06F 12/0882 (2016.01); G06F 12/0888 (2016.01); G06F 12/0891 (2016.01); G06F 12/0893 (2016.01); G06F 12/0895 (2016.01); G06F 12/0897 (2016.01); G06F 12/1009 (2016.01); G06F 12/128 (2016.01); G06F 15/78 (2006.01); G06F 15/80 (2006.01); G06F 17/16 (2006.01); G06F 17/18 (2006.01); G06T 1/20 (2006.01); G06T 1/60 (2006.01); H03M 7/46 (2006.01); G06N 3/08 (2023.01); G06T 15/06 (2011.01)
CPC G06F 15/7839 (2013.01) [G06F 7/5443 (2013.01); G06F 7/575 (2013.01); G06F 7/588 (2013.01); G06F 9/3001 (2013.01); G06F 9/30014 (2013.01); G06F 9/30036 (2013.01); G06F 9/3004 (2013.01); G06F 9/30043 (2013.01); G06F 9/30047 (2013.01); G06F 9/30065 (2013.01); G06F 9/30079 (2013.01); G06F 9/3887 (2013.01); G06F 9/5011 (2013.01); G06F 9/5077 (2013.01); G06F 12/0215 (2013.01); G06F 12/0238 (2013.01); G06F 12/0246 (2013.01); G06F 12/0607 (2013.01); G06F 12/0802 (2013.01); G06F 12/0804 (2013.01); G06F 12/0811 (2013.01); G06F 12/0862 (2013.01); G06F 12/0866 (2013.01); G06F 12/0871 (2013.01); G06F 12/0875 (2013.01); G06F 12/0882 (2013.01); G06F 12/0888 (2013.01); G06F 12/0891 (2013.01); G06F 12/0893 (2013.01); G06F 12/0895 (2013.01); G06F 12/0897 (2013.01); G06F 12/1009 (2013.01); G06F 12/128 (2013.01); G06F 15/8046 (2013.01); G06F 17/16 (2013.01); G06F 17/18 (2013.01); G06T 1/20 (2013.01); G06T 1/60 (2013.01); H03M 7/46 (2013.01); G06F 9/3802 (2013.01); G06F 9/3818 (2013.01); G06F 9/3867 (2013.01); G06F 2212/1008 (2013.01); G06F 2212/1021 (2013.01); G06F 2212/1044 (2013.01); G06F 2212/302 (2013.01); G06F 2212/401 (2013.01); G06F 2212/455 (2013.01); G06F 2212/60 (2013.01); G06N 3/08 (2013.01); G06T 15/06 (2013.01)] 17 Claims
OG exemplary drawing
 
1. An apparatus comprising:
one or more processors including a graphics processing unit (GPU); and
a memory for storage of data for processing by the one or more processors;
wherein the GPU includes a GPU cache to cache data from the memory for use by the GPU;
wherein the apparatus is to provide for dynamic overfetching of cache lines for the GPU cache, including:
dynamically selecting an overfetch boundary to be applied for one or more received read requests for the GPU,
receiving a read request requesting data for the GPU and accessing the GPU cache for the requested data, and
upon a miss in the GPU cache, overfetching data from memory or a higher level cache to the GPU cache in addition to fetching the requested data;
wherein the overfetching of data is based at least in part on the selected overfetch boundary for the one or more read requests, and provides for data to be prefetched extending to the selected overfetch boundary; and
wherein the apparatus is further to provide for eviction of data from the GPU cache with the selected overfetch boundary being maintained for the eviction, including, upon a determination that the requested data is to be evicted from the GPU cache, further evicting the overfetched data from the GPU cache according to the selected overfetch boundary applied for the one or more read requests.
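The claimed mechanism can be illustrated with a toy model (not from the patent itself; the class name `OverfetchCache`, the line size, and the boundary-selection policy are all hypothetical stand-ins): on a miss, lines are fetched from the requested address out to the currently selected overfetch boundary, and when the requested line is later evicted, the lines overfetched with it are evicted as a group, so the boundary is maintained for eviction as well.

```python
class OverfetchCache:
    """Toy cache keyed by line address, sketching the claimed dynamic overfetch.

    On a miss, all lines from the requested line up to the selected
    overfetch boundary are fetched together; on eviction of the requested
    line, the lines overfetched with it are evicted as a group.
    """

    LINE_SIZE = 64  # bytes per cache line (assumed for illustration)

    def __init__(self, backing_memory):
        self.memory = backing_memory          # dict: line address -> data
        self.lines = {}                       # cached lines: address -> data
        self.groups = {}                      # requested line -> its overfetch group
        self.boundary = 2 * self.LINE_SIZE    # current overfetch boundary (bytes)

    def select_boundary(self, recent_miss_rate):
        """Dynamically select the overfetch boundary; widening it under a
        high miss rate is a stand-in policy, not the patent's."""
        self.boundary = (4 if recent_miss_rate > 0.5 else 2) * self.LINE_SIZE

    def read(self, addr):
        line = addr - addr % self.LINE_SIZE
        if line in self.lines:                # hit: no overfetch occurs
            return self.lines[line]
        # Miss: fetch the requested line plus lines out to the boundary.
        group = []
        for a in range(line, line + self.boundary, self.LINE_SIZE):
            if a not in self.lines:
                self.lines[a] = self.memory.get(a, 0)
                group.append(a)
        self.groups[line] = group             # remember the overfetch group
        return self.lines[line]

    def evict(self, addr):
        """Evict the requested line and, per the maintained boundary,
        the lines that were overfetched along with it."""
        line = addr - addr % self.LINE_SIZE
        for a in self.groups.pop(line, [line]):
            self.lines.pop(a, None)
```

With the default 128-byte boundary, a miss on address 70 fills lines 64 and 128 together, and evicting address 70 removes both, mirroring the claim's group eviction at the selected boundary.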