| CPC H04L 45/742 (2013.01) [G06F 12/1081 (2013.01); G06F 13/28 (2013.01); H04L 45/60 (2013.01); H04L 49/9068 (2013.01)] | 33 Claims |

|
1. Network interface controller circuitry configurable for use in a host node and in association with a device driver, the host node comprising at least one graphics processing unit (GPU)-accessible memory, at least one host memory, and at least one host fabric, the host node to be communicatively coupled via at least one multi-switch fabric to a remote system, the remote system comprising at least one other GPU-accessible memory, at least one other host memory, and at least one other host fabric, the network interface controller circuitry comprising:
network interface circuitry for use in Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCE) packet data communication with the remote system via the at least one multi-switch fabric, the RoCE packet data communication to indicate at least one RDMA write to the host node from the remote system and/or at least one RDMA read from the host node to the remote system, the RoCE packet data communication to be initiated in response, at least in part, to at least one host application request; and
programmable circuitry to perform operations comprising:
in event that the RoCE packet data communication indicates the at least one RDMA write, directly writing, via the at least one host fabric, received packet data to the at least one GPU-accessible memory;
in event that the RoCE packet data communication indicates the at least one RDMA read, directly reading, via the at least one host fabric, other data from the at least one GPU-accessible memory that is to be provided to the remote system via the RoCE packet data communication; and
encryption, decryption, and compression-related host central processing unit (CPU) offload operations;
wherein:
the writing and the reading are to be performed in a manner that bypasses both (1) host CPU and/or host operating system (OS) in the writing and the reading, and (2) copying of the received packet data and the other data to the at least one host memory of the host node;
the writing and/or the reading are configurable to comprise use of direct data placement (DDP);
the writing and/or the reading are configurable to comprise use of address translation;
the address translation is to be implemented, at least in part, using the device driver;
portions of the received packet data and/or the other data are to be routed to their destinations via respective fabric-associated routings;
the respective fabric-associated routings are configurable to be mutually different from each other, at least in part; and
the at least one multi-switch fabric is to communicatively couple multiple switches associated with the host node and the remote system.
|
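
For illustration only, the write and read paths recited in claim 1 can be modeled as a toy software dispatch. This is a minimal sketch, not the claimed circuitry: `ToyNic`, `Packet`, `Op`, the `page_table` layout, and the 4096-byte page size are all hypothetical names and values chosen for the example. It assumes a device driver has already installed the address translations, and it uses plain byte buffers to stand in for GPU-accessible memory and host memory.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Op(Enum):
    RDMA_WRITE = auto()  # remote system writes into the host node
    RDMA_READ = auto()   # remote system reads from the host node


@dataclass
class Packet:
    op: Op
    virt_addr: int        # address carried in the request (hypothetical field)
    payload: bytes = b""  # data to place, for a write
    length: int = 0       # number of bytes requested, for a read


@dataclass
class ToyNic:
    gpu_mem: bytearray          # stands in for GPU-accessible memory
    host_mem: bytearray         # stands in for host memory; never touched below
    page_table: dict[int, int]  # virtual page -> GPU page, assumed driver-installed
    page_size: int = 4096       # assumed page size

    def translate(self, virt_addr: int, size: int) -> int:
        """Map a request address to a GPU-memory offset (address translation)."""
        page, offset = divmod(virt_addr, self.page_size)
        if offset + size > self.page_size:
            raise ValueError("access crosses a page boundary in this toy model")
        base = self.page_table[page] * self.page_size + offset
        if base + size > len(self.gpu_mem):
            raise ValueError("access falls outside GPU-accessible memory")
        return base

    def handle(self, pkt: Packet) -> bytes:
        """Place or fetch data directly in GPU memory, with no host-memory copy."""
        if pkt.op is Op.RDMA_WRITE:
            base = self.translate(pkt.virt_addr, len(pkt.payload))
            self.gpu_mem[base:base + len(pkt.payload)] = pkt.payload
            return b""
        base = self.translate(pkt.virt_addr, pkt.length)
        return bytes(self.gpu_mem[base:base + pkt.length])
```

In this model, `handle` never reads or writes `host_mem`, which mirrors the recited bypass of copying to host memory; encryption, compression, and fabric routing are deliberately left out.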