US 12,265,492 B2
Circular buffer for input and output of tensor computations
Liangzhen Lai, Fremont, CA (US); Harshit Khaitan, Fremont, CA (US); Yu Hsin Chen, Santa Clara, CA (US); Kyong Ho Lee, Los Altos, CA (US); and Xu Chen, San Jose, CA (US)
Assigned to Meta Platforms, Inc., Menlo Park, CA (US)
Filed by Meta Platforms, Inc., Menlo Park, CA (US)
Filed on Feb. 21, 2023, as Appl. No. 18/172,030.
Prior Publication US 2024/0281393 A1, Aug. 22, 2024
Int. Cl. G06F 13/28 (2006.01); G06F 13/40 (2006.01)

CPC G06F 13/28 (2013.01) [G06F 13/4027 (2013.01)]

20 Claims

1. A computing system for accelerating machine-learning computations comprising:

a direct memory access component; and

one or more controllers that are communicably connected to the direct memory access component via one or more system buses, wherein each of the one or more controllers is configured to:

receive, from the direct memory access component, a token indicating a data chunk becomes available in a first circular buffer of a pre-determined size;

determine that a computation is to be performed with data including the data chunk based on the token;

generate, in response to the determination, one or more addresses corresponding to one or more data chunks within the first circular buffer that are to be retrieved for the computation, wherein:

when a generated address is greater than a pre-determined maximum associated with the first circular buffer, the generated address is subtracted by the pre-determined size of the first circular buffer; and

when the generated address is less than a pre-determined minimum associated with the first circular buffer, the generated address is added by the pre-determined size of the first circular buffer;

wherein a value of a token counter associated with the first circular buffer is initialized to −k when the direct memory access component determines that k data chunks need to be loaded to the first circular buffer without causing a computation.
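
The address wrap-around and the token counter recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. The class and method names (CircularBuffer, TokenCounter, wrap, on_token) are hypothetical. The sketch assumes that the pre-determined minimum is the buffer's base address and the pre-determined maximum is its last valid address. It also assumes that each received token increments the counter and that a computation is triggered only once the counter becomes positive; the claim itself states only that the counter is initialized to −k.

```python
from dataclasses import dataclass


@dataclass
class CircularBuffer:
    """Hypothetical model of a circular buffer of pre-determined size."""
    base: int  # assumed pre-determined minimum address
    size: int  # pre-determined size, in address units

    @property
    def max_addr(self) -> int:
        # Assumed pre-determined maximum: the last valid address.
        return self.base + self.size - 1

    def wrap(self, addr: int) -> int:
        """Fold a generated address back into the buffer's range."""
        if addr > self.max_addr:
            return addr - self.size  # above the maximum: subtract the size
        if addr < self.base:
            return addr + self.size  # below the minimum: add the size
        return addr


class TokenCounter:
    """Hypothetical token counter, initialized to -k for k preloaded chunks."""

    def __init__(self, preload_chunks: int = 0):
        self.value = -preload_chunks

    def on_token(self) -> bool:
        """Count one token; return True if a computation should start.

        Assumption: a computation is triggered only when the counter is
        positive, so the first k tokens load data without causing one.
        """
        self.value += 1
        return self.value > 0


# Usage: an 8-entry buffer starting at address 0x100, with 2 preloaded chunks.
buf = CircularBuffer(base=0x100, size=8)
wrapped_high = buf.wrap(0x108)  # one past the maximum, folds to the base
wrapped_low = buf.wrap(0xFF)    # one below the minimum, folds to the top

counter = TokenCounter(preload_chunks=2)
fired = [counter.on_token() for _ in range(3)]
```

Under these assumptions, the first two tokens only raise the counter from −2 to 0, and the third token is the first to start a computation. A single add or subtract is enough for the wrap as long as a generated address never overshoots the range by more than one buffer size.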