US 12,475,054 B1
Dynamic cache allocation in artificial intelligence accelerator
Ravi Krishnan Venkatesan, Bangalore (IN); Bharat Kumar Daga, Fremont, CA (US); and Eitan Joshua, Ramot Menashe (IL)
Assigned to Habana Labs Ltd., Caesarea (IL)
Filed by Habana Labs Ltd., Caesarea (IL)
Filed on Mar. 28, 2024, as Appl. No. 18/620,278.
Int. Cl. G06F 12/08 (2016.01); G06F 12/0868 (2016.01); G06F 12/0891 (2016.01)
CPC G06F 12/0891 (2013.01) [G06F 12/0868 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method of cache allocation in an artificial intelligence (AI) accelerator, comprising:
receiving a first data transfer request from a first compute engine in the AI accelerator, the first data transfer request comprising a request to read or write a cache line;
installing the cache line in a first cache associated with the first compute engine;
after receiving the first data transfer request, receiving a second data transfer request from a second compute engine in the AI accelerator, the second data transfer request comprising a request to read the cache line; and
in response to the second data transfer request, installing the cache line in a second cache, wherein the second cache is accessible by the second compute engine and one or more other compute engines in the AI accelerator, and the second cache is closer to a memory than the first cache.
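The claimed method can be paraphrased as a simple two-level allocation policy: a line first requested by one engine is installed in that engine's own cache, and when a different engine later reads the same line, it is installed in a shared cache closer to memory. The following is a minimal illustrative sketch of that policy, not the patented implementation; all class and method names (`CacheHierarchy`, `access`, the engine identifiers) are hypothetical.

```python
class CacheHierarchy:
    """Toy model of the claimed allocation policy (illustrative only)."""

    def __init__(self):
        self.private = {}    # engine_id -> set of lines in that engine's cache
        self.shared = set()  # shared cache, closer to memory, visible to all engines
        self.owner = {}      # line address -> engine that first installed it

    def access(self, engine_id, line):
        cache = self.private.setdefault(engine_id, set())
        if line not in self.owner:
            # First data transfer request: install the line in the cache
            # associated with the requesting compute engine.
            self.owner[line] = engine_id
            cache.add(line)
            return "private"
        if self.owner[line] != engine_id:
            # A second, different engine reads the line: install it in the
            # shared cache so other engines can also access it.
            self.shared.add(line)
            return "shared"
        return "private"
```

For example, a first access by a hypothetical engine `"mme0"` lands in its private cache, and a later read of the same line by `"tpc1"` promotes it to the shared cache.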