| CPC G06F 12/128 (2013.01) [G06F 12/084 (2013.01); G06F 12/0895 (2013.01); G06N 3/044 (2023.01); G06N 3/045 (2023.01); G06N 3/063 (2013.01); G06N 3/084 (2013.01); G06F 2212/601 (2013.01); G06F 2212/6042 (2013.01); G06F 2212/6046 (2013.01); G06N 20/00 (2019.01)] | 20 Claims |

1. A system comprising:
a last level cache (LLC) dynamically divided into private caches each corresponding to compute engines performing concurrent compute operations on different deep learning (DL) layers of a DL neural network;
DL hardware circuitry communicatively coupled to the LLC by an interconnect, wherein the DL hardware circuitry comprises the compute engines to execute the concurrent compute operations on the DL layers using the LLC, wherein each compute engine corresponds to a different one of the DL layers; and
a system cache controller communicably coupled to the DL hardware circuitry and the LLC, the system cache controller to:
receive, from the DL hardware circuitry, a cache access request from a first compute engine of the compute engines performing the concurrent compute operations for a first DL layer of the DL neural network; and
direct the cache access request to a first private cache of the private caches of the LLC, the first private cache corresponding to the first compute engine.
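The routing recited in claim 1 can be sketched in simplified form: a cache controller that divides a shared LLC into per-engine private partitions and directs each compute engine's cache access request to its own partition. This is an illustrative model only, not the patented implementation; the class and method names (`SystemCacheController`, `direct`), the partition sizing, and the eviction choice are all assumptions made for the sketch.

```python
# Illustrative sketch (assumed names/policies, not from the patent):
# an LLC dynamically divided into private caches, one per compute
# engine, with a controller directing each request to the requester's
# private partition.

class SystemCacheController:
    def __init__(self, num_engines, llc_lines):
        # Divide the LLC evenly into one private cache per engine
        # (the patent's division is dynamic; even split is an assumption).
        self.partition_size = llc_lines // num_engines
        self.partitions = {e: set() for e in range(num_engines)}

    def direct(self, engine_id, address):
        """Direct a cache access request to the engine's private cache."""
        private_cache = self.partitions[engine_id]
        if address in private_cache:
            return "hit"
        if len(private_cache) >= self.partition_size:
            private_cache.pop()        # evict an arbitrary victim line
        private_cache.add(address)     # fill the line on a miss
        return "miss"

ctrl = SystemCacheController(num_engines=4, llc_lines=16)
print(ctrl.direct(0, 0x100))  # first access by engine 0: miss
print(ctrl.direct(0, 0x100))  # same engine, same line: hit
print(ctrl.direct(1, 0x100))  # engine 1's private cache is separate: miss
```

Because each engine (and hence each DL layer) is confined to its own partition, one layer's working set cannot evict another layer's cached data during concurrent execution.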