US 12,242,403 B2
Direct access to reconfigurable processor memory
Conrad Alexander Turlik, Palo Alto, CA (US); Sudhakar Dindukurti, Palo Alto, CA (US); Anand Misra, Palo Alto, CA (US); Arjun Sabnis, Palo Alto, CA (US); Milad Sharif, Palo Alto, CA (US); Ravinder Kumar, Palo Alto, CA (US); Joshua Earle Polzin, Palo Alto, CA (US); Arnav Goel, Palo Alto, CA (US); and Steven Dai, Palo Alto, CA (US)
Assigned to SambaNova Systems, Inc., Palo Alto, CA (US)
Filed by SambaNova Systems, Inc., Palo Alto, CA (US)
Filed on Mar. 14, 2023, as Appl. No. 18/121,224.
Claims priority of provisional application 63/321,654, filed on Mar. 18, 2022.
Prior Publication US 2023/0297527 A1, Sep. 21, 2023
Int. Cl. G06F 13/28 (2006.01)
CPC G06F 13/28 (2013.01) [G06F 2213/3808 (2013.01)]
14 Claims
1. A system, comprising:
a first data processing system, comprising:
a first reconfigurable processor with a first reconfigurable processor memory,
a first host that is operatively coupled to the first reconfigurable processor, comprising:
a first host processor including runtime logic with a user space and a kernel space, and
a first host memory that is coupled to the first host processor, and
a first network interface controller (NIC) that is operatively coupled to a network, the first reconfigurable processor, and the first host processor; and
a second data processing system that is coupled via the network to the first data processing system, comprising:
a second reconfigurable processor with a second reconfigurable processor memory,
a second host that is operatively coupled to the second reconfigurable processor, comprising:
a second host processor, and
a second host memory that is coupled to the second host processor, and
a second network interface controller (NIC) that is operatively coupled to the network, the second reconfigurable processor, and the second host processor;
wherein the first reconfigurable processor is configured to implement a virtual function that uses a virtual address for a memory access operation;
wherein the first host processor is configured to implement an application programming interface (API) within the runtime logic that includes a first module located in the user space, a second module executing in kernel space, a third module executing in the kernel space, and input/output memory management unit (IOMMU) page tables;
wherein the API translates the virtual address into a physical address using the first module and the second module for virtual addresses targeting the first host memory and additionally using the third module for virtual addresses targeting the first reconfigurable processor memory, wherein the third module translates the virtual address targeting the first reconfigurable processor memory to an input/output virtual address (IOVA), and then uses the IOMMU page tables to translate the IOVA to the physical address; and
wherein the first NIC uses the physical address to initiate a direct memory access operation at the second reconfigurable processor memory or the second host memory that moves data directly between the first reconfigurable processor and the second reconfigurable processor memory or the second host memory, wherein the data bypasses the first host processor and is transferred directly between the first reconfigurable processor and the first NIC.
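
The two translation paths recited in claim 1 can be illustrated with a minimal sketch. This is a toy model under stated assumptions, not the patented implementation: the 4 KiB page size, the dictionary-based page tables, and every name (`ToyTranslator`, `lookup`, `host_table`, `rp_va_to_iova`, `iommu_table`) are hypothetical and do not appear in the patent. The host-memory path stands in for the combined work of the first and second modules; the reconfigurable-processor-memory path adds a stand-in for the third module, which first produces an IOVA and then resolves it through the IOMMU page tables.

```python
PAGE_SIZE = 4096  # assumed page size for this toy model (not from the patent)


def lookup(table, addr):
    """Translate addr through a page-granular table, preserving the page offset."""
    page, offset = divmod(addr, PAGE_SIZE)
    if page not in table:
        raise KeyError(f"no mapping for page {page:#x}")
    return table[page] * PAGE_SIZE + offset


class ToyTranslator:
    """Hypothetical stand-in for the API's address translation in claim 1."""

    def __init__(self, host_table, rp_va_to_iova, iommu_table):
        self.host_table = host_table        # VA page -> PA page (first + second module)
        self.rp_va_to_iova = rp_va_to_iova  # VA page -> IOVA page (third module)
        self.iommu_table = iommu_table      # IOVA page -> PA page (IOMMU page tables)

    def translate(self, va, target):
        if target == "host":
            # Virtual address targeting host memory: one-step lookup.
            return lookup(self.host_table, va)
        if target == "rp":
            # Virtual address targeting reconfigurable processor memory:
            # VA -> IOVA, then IOVA -> PA via the IOMMU page tables.
            iova = lookup(self.rp_va_to_iova, va)
            return lookup(self.iommu_table, iova)
        raise ValueError(f"unknown target: {target!r}")


# Example mappings (arbitrary illustrative page numbers).
translator = ToyTranslator(
    host_table={0x10: 0x200},
    rp_va_to_iova={0x20: 0x300},
    iommu_table={0x300: 0x400},
)
host_pa = translator.translate(0x10 * PAGE_SIZE + 5, "host")
rp_pa = translator.translate(0x20 * PAGE_SIZE + 7, "rp")
print(hex(host_pa), hex(rp_pa))  # prints: 0x200005 0x400007
```

In this sketch the resulting physical address is what a NIC would be handed to start the direct memory access operation; the data movement itself, and the bypass of the host processor, are outside what the model covers.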