US 12,124,383 B2
Systems and methods for cache optimization
Altug Koker, El Dorado Hills, CA (US); Joydeep Ray, Folsom, CA (US); Elmoustapha Ould-Ahmed-Vall, Chandler, AZ (US); Abhishek Appu, El Dorado Hills, CA (US); Aravindh Anantaraman, Folsom, CA (US); Valentin Andrei, San Jose, CA (US); Durgaprasad Bilagi, Folsom, CA (US); Varghese George, Folsom, CA (US); Brent Insko, Portland, OR (US); Sanjeev Jahagirdar, Folsom, CA (US); Scott Janus, Loomis, CA (US); Pattabhiraman K, Bangalore (IN); SungYe Kim, Folsom, CA (US); Subramaniam Maiyuran, Gold River, CA (US); Vasanth Ranganathan, El Dorado Hills, CA (US); Lakshminarayanan Striramassarma, Folsom, CA (US); and Xinmin Tian, Union City, CA (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Jul. 12, 2022, as Appl. No. 17/862,739.
Application 17/862,739 is a continuation of application No. 17/428,529, previously published as PCT/US2020/022833, filed on Mar. 14, 2020.
Claims priority of provisional application 62/819,337, filed on Mar. 15, 2019.
Claims priority of provisional application 62/819,361, filed on Mar. 15, 2019.
Claims priority of provisional application 62/819,435, filed on Mar. 15, 2019.
Claims priority of provisional application 62/935,729, filed on Nov. 15, 2019.
Prior Publication US 2022/0350751 A1, Nov. 3, 2022
Int. Cl. G06F 12/00 (2006.01); G06F 12/0875 (2016.01); G06F 12/0891 (2016.01); G06F 12/123 (2016.01); G06T 1/60 (2006.01)

CPC G06F 12/123 (2013.01) [G06F 12/0875 (2013.01); G06F 12/0891 (2013.01); G06T 1/60 (2013.01); G06F 2212/302 (2013.01)]

24 Claims

1. A graphics processing unit (GPU) comprising:

    a plurality of groups of cores, each group of cores including:

        a plurality of cores of a first type; and

        a plurality of cores of a second type, wherein the plurality of cores of the second type are tensor cores;

    a plurality of combined level 1 (L1) cache and shared memory units, each corresponding to a different group of cores of the plurality of groups of cores;

    a level 2 (L2) cache to be shared by the plurality of groups of cores;

    a plurality of memory controllers to couple the GPU to a memory; and

    a cache controller associated with the L2 cache, in response to an instruction from a first core of the plurality of groups of cores, to:

        apply a second cache eviction importance to a data in the L2 cache instead of a first cache eviction importance while the data in the L2 cache is to remain useable, wherein the second cache eviction importance is indicated by the instruction.
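
Illustrative note (not part of the claims): the final limitation of claim 1 can be pictured with a small software model. The sketch below is a toy under stated assumptions, not the claimed hardware. The class `ToyL2Cache`, the method `set_importance`, the integer importance scale (a larger value is taken to mean "evict later"), and the tie-break by least recent use are all hypothetical choices made for illustration. It shows only that an instruction can replace a line's first eviction importance with a second one while the line stays resident and readable.

```python
from collections import OrderedDict


class ToyL2Cache:
    """Toy model of a shared cache with per-line eviction importance.

    Hypothetical illustration only; names and policy are assumptions.
    """

    def __init__(self, capacity, default_importance=0):
        self.capacity = capacity
        self.default_importance = default_importance
        # addr -> [data, importance]; order runs from least to most recently used.
        self.lines = OrderedDict()

    def access(self, addr, data=None):
        """Return cached data on a hit; on a miss, fill the line with `data`."""
        if addr in self.lines:
            self.lines.move_to_end(addr)  # mark as most recently used
            return self.lines[addr][0]
        if len(self.lines) >= self.capacity:
            self._evict()
        # A new line starts with the first (default) eviction importance.
        self.lines[addr] = [data, self.default_importance]
        return data

    def set_importance(self, addr, importance):
        """Model of the instruction: apply a second eviction importance.

        The line is neither invalidated nor refilled, so its data stays usable.
        """
        if addr in self.lines:
            self.lines[addr][1] = importance

    def _evict(self):
        # Assumed policy: evict the lowest-importance line. min() returns the
        # first minimum in iteration order, i.e. the least recently used one.
        victim = min(self.lines, key=lambda a: self.lines[a][1])
        del self.lines[victim]


cache = ToyL2Cache(capacity=2)
cache.access(0xA, "a")
cache.access(0xB, "b")
cache.set_importance(0xA, 1)  # instruction raises the importance of line 0xA
cache.access(0xC, "c")        # cache is full, so one line must be evicted
print(sorted(hex(a) for a in cache.lines))
```

Under these assumptions, line 0xA survives the fill of 0xC even though it is the least recently used line, because the instruction changed its importance; line 0xB is evicted instead.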