US 12,380,041 B2
Method and apparatus for data transfer between accessible memories of multiple processors in a heterogeneous processing system using two memory to memory transfer operations
Arnav Goel, Palo Alto, CA (US); Neal Sanghvi, Palo Alto, CA (US); Jiayu Bai, Palo Alto, CA (US); Qi Zheng, Palo Alto, CA (US); and Ravinder Kumar, Palo Alto, CA (US)
Assigned to SambaNova Systems, Inc., Palo Alto, CA (US)
Filed by SambaNova Systems, Inc., Palo Alto, CA (US)
Filed on Jan. 19, 2023, as Appl. No. 18/099,006.
Prior Publication US 2024/0248860 A1, Jul. 25, 2024
Int. Cl. G06F 13/00 (2006.01); G06F 9/50 (2006.01); G06F 13/16 (2006.01); G06F 13/28 (2006.01)
CPC G06F 13/1673 (2013.01) [G06F 9/5016 (2013.01); G06F 13/28 (2013.01)]
18 Claims
 
1. A heterogeneous processing system, comprising:
switch and bus circuitry;
a host processor coupled to a host memory accessible from the host processor without utilizing the switch and bus circuitry, wherein the host processor allocates buffer space within the host memory;
a first processor coupled to a first memory accessible from the first processor without utilizing the switch and bus circuitry;
a second processor coupled to a second memory accessible from the second processor without utilizing the switch and bus circuitry;
a first direct memory access (DMA) engine incorporated within the first processor; and
a second DMA engine incorporated within the second processor;
wherein the switch and bus circuitry communicatively couples the host processor, the host memory, the first DMA engine of the first processor and the second DMA engine of the second processor;
wherein the first processor is configured to execute a first node of a computation graph which creates and stores first data into the first memory without using the switch and bus circuitry; and
wherein the host processor is configured to program the first DMA engine to transfer the first data from the first memory to a first location in the buffer space of the host memory through the switch and bus circuitry and to program the second DMA engine to transfer the first data from the first location in the buffer space of the host memory into the second memory through the switch and bus circuitry.
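
The data path recited in claim 1 can be illustrated with a small simulation. The sketch below is not taken from the patent: the names DmaEngine, transfer, and host_staged_copy are hypothetical, each memory is modeled as a Python bytearray, and it assumes that the second hop (host buffer to second memory) is driven by the DMA engine of the second processor. It shows only the ordering of the two memory-to-memory operations through a host-allocated staging buffer.

```python
class DmaEngine:
    """Hypothetical stand-in for a DMA engine inside a processor."""

    def __init__(self, name):
        self.name = name
        self.log = []  # one (source, destination, byte count) entry per transfer

    def transfer(self, src, src_off, dst, dst_off, nbytes, route):
        # Reject transfers that would read or write outside either memory.
        if min(src_off, dst_off, nbytes) < 0:
            raise ValueError("offsets and length must be non-negative")
        if src_off + nbytes > len(src) or dst_off + nbytes > len(dst):
            raise ValueError("transfer exceeds memory bounds")
        dst[dst_off:dst_off + nbytes] = src[src_off:src_off + nbytes]
        self.log.append((route[0], route[1], nbytes))


def host_staged_copy(first_mem, src_off, second_mem, dst_off,
                     host_mem, buf_off, first_dma, second_dma, nbytes):
    """Host-side sequence: program two DMA transfers through the host buffer."""
    # Operation 1: first memory -> first location in the host buffer space.
    first_dma.transfer(first_mem, src_off, host_mem, buf_off, nbytes,
                       ("first_mem", "host_buffer"))
    # Operation 2: host buffer -> second memory (assumed to use the second engine).
    second_dma.transfer(host_mem, buf_off, second_mem, dst_off, nbytes,
                        ("host_buffer", "second_mem"))


# Usage: the first processor's graph node has produced `payload` in its memory.
payload = b"first data"
first_mem = bytearray(payload + bytes(22))   # first memory, 32 bytes
second_mem = bytearray(32)                   # second memory, initially zeroed
host_mem = bytearray(64)                     # host memory holding the buffer space
dma1, dma2 = DmaEngine("dma1"), DmaEngine("dma2")

host_staged_copy(first_mem, 0, second_mem, 8, host_mem, 16,
                 dma1, dma2, len(payload))
```

In this model the first data never moves directly between the two processor memories; it is always staged at the buffer offset chosen by the host, which matches the two-operation structure recited in the claim.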