US 12,094,531 B2
Caching techniques for deep learning accelerator
Aliasger Tayeb Zaidy, Seattle, WA (US); Patrick Alan Estep, Rowlett, TX (US); and David Andrew Roberts, Wellesley, MA (US)
Assigned to Micron Technology, Inc., Boise, ID (US)
Filed by Micron Technology, Inc., Boise, ID (US)
Filed on Jan. 11, 2021, as Appl. No. 17/146,314.
Prior Publication US 2022/0223201 A1, Jul. 14, 2022
Int. Cl. G06F 3/06 (2006.01); G06F 12/0862 (2016.01); G06F 12/0897 (2016.01); G06N 3/063 (2023.01); G06N 3/08 (2023.01); G11C 11/54 (2006.01)
CPC G11C 11/54 (2013.01) [G06F 12/0862 (2013.01); G06F 12/0897 (2013.01); G06N 3/063 (2013.01); G06N 3/08 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A device, comprising:
a plurality of processing units configured to execute instructions and perform at least matrix computations of an artificial neural network via execution of the instructions;
a local memory coupled to the processing units and configured to store at least operands of the instructions during operations of the processing units in execution of the instructions;
a memory configured as a buffer;
a random access memory; and
a logic circuit coupled to the buffer, the local memory, and the random access memory;
wherein the instructions include a first instruction to fetch an item from the random access memory to the local memory;
the first instruction includes a field related to caching the item in the buffer; and
during execution of the first instruction, the logic circuit is configured to determine whether to load the item through the buffer based at least in part on the field specified in the first instruction.
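The mechanism of claim 1 can be sketched in software: a fetch instruction carries a caching field, and the logic circuit either routes the fetched item through the buffer (keeping a cached copy) or bypasses the buffer entirely. The sketch below is illustrative only; the class, field encoding, and method names (`Accelerator`, `CACHE_IN_BUFFER`, `execute_fetch`) are hypothetical assumptions, not taken from the patent.

```python
# Hypothetical sketch of the claimed fetch path. A fetch instruction's
# caching field tells the logic circuit whether to load the item from
# random access memory through the buffer or to bypass the buffer.

CACHE_IN_BUFFER = 0x1  # assumed encoding of the caching field


class Accelerator:
    def __init__(self):
        self.random_access_memory = {}  # backing store: address -> item
        self.buffer = {}                # cache between RAM and local memory
        self.local_memory = {}          # operand store for the processing units

    def execute_fetch(self, address, flags):
        """Fetch `address` from RAM into local memory, honoring the caching field."""
        if address in self.buffer:
            # Buffer hit: serve the item from the cached copy.
            item = self.buffer[address]
        else:
            item = self.random_access_memory[address]
            if flags & CACHE_IN_BUFFER:
                # Field set: load through the buffer, retaining a copy for reuse.
                self.buffer[address] = item
        self.local_memory[address] = item
        return item
```

Bypassing the buffer when the field is clear would keep streaming operands that will not be reused from evicting items that will be, which is the usual motivation for such per-instruction cache hints.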