US 12,254,526 B2
On chip dense memory for temporal buffering
Varghese George, Folsom, CA (US); Altug Koker, El Dorado Hills, CA (US); Aravindh Anantaraman, Folsom, CA (US); Subramaniam Maiyuran, Gold River, CA (US); SungYe Kim, Folsom, CA (US); Valentin Andrei, San Jose, CA (US); Elmoustapha Ould-Ahmed-Vall, Chandler, AZ (US); Joydeep Ray, Folsom, CA (US); Abhishek R. Appu, El Dorado Hills, CA (US); Nicolas C. Galoppo von Borries, Portland, OR (US); Prasoonkumar Surti, Folsom, CA (US); and Mike Macpherson, Portland, OR (US)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Mar. 15, 2019, as Appl. No. 16/355,573.
Prior Publication US 2020/0294182 A1, Sep. 17, 2020
Int. Cl. G06T 1/20 (2006.01); G06F 16/17 (2019.01); G06N 20/00 (2019.01); G06T 1/60 (2006.01)
CPC G06T 1/20 (2013.01) [G06F 16/1724 (2019.01); G06N 20/00 (2019.01); G06T 1/60 (2013.01)]
14 Claims
 
1. A graphics multiprocessor, comprising:
a plurality of compute engines to perform first computations to generate a first set of data of a process;
cache for storing data; and
a high density memory for temporal buffering that is integrated on a same semiconductor chip with the plurality of compute engines and the cache, the high density memory to receive the first set of data, to temporarily store the first set of data, and to provide the first set of data from the high density memory to the cache during a first time period that is prior to a second time period when the plurality of compute engines request the first set of data for second computations of the process,
wherein the plurality of compute engines use the first set of data for second computations to generate a second set of data,
wherein the first set of data comprises activation data for a forward pass of the process that will be transferred from the high density memory to the cache prior to when the activation data is needed during a backward pass of the process.
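
The data flow recited in claim 1 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the claimed hardware or any Intel implementation: the class `TemporalBuffer`, its methods `store`, `prefetch`, and `request`, the function `run`, and the toy arithmetic (doubling in the forward pass, halving in the backward pass) are all hypothetical names and choices made for illustration. Two dictionaries stand in for the high density memory and the cache. Activation data produced in the forward pass is parked in the dense store, then moved into the cache one step before the backward pass asks for it, so each request is served from the cache.

```python
class TemporalBuffer:
    """Toy model (hypothetical) of an on-chip dense memory that feeds a cache."""

    def __init__(self):
        self.dense = {}  # stands in for the high density memory
        self.cache = {}  # stands in for the cache

    def store(self, key, data):
        # Receive and temporarily hold the first set of data.
        self.dense[key] = data

    def prefetch(self, key):
        # First time period: move data into the cache before it is requested.
        self.cache[key] = self.dense.pop(key)

    def request(self, key):
        # Second time period: a compute engine asks for the data.
        hit = key in self.cache
        if not hit:
            self.prefetch(key)  # fallback path when nothing was prefetched
        return self.cache.pop(key), hit


def run(num_layers=3):
    buf = TemporalBuffer()

    # Forward pass (first computations): each toy layer doubles its input,
    # and the resulting activation is parked in the dense memory.
    x = 1.0
    for layer in range(num_layers):
        x = 2.0 * x
        buf.store(layer, x)

    # Backward pass (second computations): layers are visited in reverse.
    hits, grads = [], []
    buf.prefetch(num_layers - 1)  # stage the first activation needed
    for layer in reversed(range(num_layers)):
        if layer > 0:
            buf.prefetch(layer - 1)  # stage the next activation early
        activation, hit = buf.request(layer)
        hits.append(hit)
        grads.append(0.5 * activation)  # toy stand-in for the second set of data
    return hits, grads


if __name__ == "__main__":
    hits, grads = run(3)
    print(hits)
    print(grads)
```

In this model the prefetch for layer `i - 1` is issued while layer `i` is being processed, which mirrors the claim's ordering of the first time period ahead of the second. Real hardware would decide when to transfer data based on timing and capacity constraints that this sketch does not attempt to represent.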