US 11,940,934 B2
Efficient and concurrent model execution
Marie Mai Nguyen, Pittsburgh, PA (US); Rekha Pitchumani, Oak Hill, VA (US); Zongwang Li, Dublin, CA (US); Yang Seok Ki, Palo Alto, CA (US); and Krishna Teja Malladi, San Jose, CA (US)
Assigned to SAMSUNG ELECTRONICS CO., LTD., (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on Jan. 27, 2022, as Appl. No. 17/586,767.
Claims priority of provisional application 63/288,513, filed on Dec. 10, 2021.
Prior Publication US 2023/0185739 A1, Jun. 15, 2023
Int. Cl. G06F 12/02 (2006.01); G06F 13/16 (2006.01)
CPC G06F 13/1668 (2013.01)
18 Claims
 
1. An accelerator, comprising:
a circuit to execute instructions on a first data batch to produce a first processed data batch, a data model including the first data batch and a second data batch, the circuit including one or more cores, the circuit configured to execute the instructions on the data model using the first data batch and the second data batch;
a first tier storage including a first capacity and a first latency;
a second tier storage including a second capacity and a second latency, the second capacity larger than the first capacity, the second latency being slower than the first latency;
a bus to transfer at least one of the first data batch or the first processed data batch between the first tier storage and the second tier storage; and
a prefetcher to transfer the first data batch from the second tier storage to the first tier storage over the bus based at least in part on data placement information from a host regarding the first data batch.
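
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names Tier, Prefetcher, execute, and run, the capacity and latency figures, and the doubling operation that stands in for "instructions" are all hypothetical, and the host's data placement information is assumed to be a simple ordered list of batch keys.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """A storage tier; capacity is counted in batches (an assumption)."""
    name: str
    capacity: int
    latency_ns: int  # illustrative figure only, not used for timing
    data: dict = field(default_factory=dict)

    def put(self, key, batch):
        if key not in self.data and len(self.data) >= self.capacity:
            raise MemoryError(f"{self.name} is full")
        self.data[key] = batch


class Prefetcher:
    """Copies batches from the slow tier to the fast tier over a 'bus'."""

    def __init__(self, fast, slow):
        self.fast = fast
        self.slow = slow
        self.bus_transfers = 0  # counts slow-to-fast transfers

    def prefetch(self, placement):
        # placement: hypothetical host hint listing batch keys in the
        # order the circuit is expected to need them.
        for key in placement:
            if key in self.fast.data:
                continue  # already resident in the fast tier
            if len(self.fast.data) >= self.fast.capacity:
                break  # fast tier is full; stop prefetching
            self.fast.put(key, self.slow.data[key])
            self.bus_transfers += 1


def execute(batch):
    # Stand-in for the circuit executing instructions on one data batch.
    return [x * 2 for x in batch]


def run():
    # Second tier: larger capacity, higher latency than the first tier.
    fast = Tier("tier1", capacity=1, latency_ns=100)
    slow = Tier("tier2", capacity=8, latency_ns=1000)
    slow.put("batch0", [1, 2, 3])
    slow.put("batch1", [4, 5, 6])

    prefetcher = Prefetcher(fast, slow)
    results = {}
    for key in ["batch0", "batch1"]:
        prefetcher.prefetch([key])         # host hint: this batch is next
        batch = fast.data.pop(key)         # circuit reads from the fast tier
        results[key] = execute(batch)
        slow.put("out_" + key, results[key])  # write processed batch back
    return results, prefetcher.bus_transfers


if __name__ == "__main__":
    print(run())
```

In this sketch the fast tier holds a single batch, so each batch is staged, processed, and evicted before the next one is prefetched. A larger first tier would let the prefetcher stage the second batch while the first is still being processed, which is the kind of overlap the claim's prefetcher appears intended to enable.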