US 12,306,903 B1
Performance of tensor operations
Gavin Uberti, Kirkland, WA (US)
Assigned to ETCHED.AI INC., San Jose, CA (US)
Filed by ETCHED.AI INC., San Jose, CA (US)
Filed on Oct. 22, 2024, as Appl. No. 18/922,941.
Int. Cl. G06F 17/16 (2006.01)
CPC G06F 17/16 (2013.01)
22 Claims
 
1. A method of performing tensor operations, the method comprising:
loading a first tensor into a plurality of processing devices, the first tensor split into a plurality of first tensor tiles that are distributed among the plurality of processing devices, the plurality of processing devices further including portions of a second tensor split into a plurality of second tensor tiles that are distributed among the plurality of processing devices;
performing a tensor operation with the first tensor and the second tensor using the plurality of processing devices to generate an intermediate tensor that is split in a plurality of intermediate tensor tiles distributed among the plurality of processing devices;
after performing the tensor operation with the first tensor and the second tensor, transferring one or more of the plurality of intermediate tensor tiles amongst one or more of the plurality of processing devices without any of the plurality of processing devices including the entire intermediate tensor; and
after transferring the one or more of the plurality of intermediate tensor tiles, performing, using the plurality of processing devices, a tensor operation with the intermediate tensor and a third tensor, which is split into a plurality of third tensor tiles that are distributed among the plurality of processing devices, to generate a fourth tensor.
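
For illustration only, the sequence recited in claim 1 can be sketched as a single-process simulation. Everything specific in the sketch is an assumption and is not taken from the patent: the tensor operation is taken to be matrix multiplication, the processing devices are modeled as a hypothetical 2 x 2 grid of Python dictionaries, the first tensor is tiled by rows, the second and third tensors are tiled by columns, and the helper names (`matmul`, `row_tiles`, `col_tiles`, `hconcat`, `two_stage_tiled_matmul`) are invented for the example. The transfer step is modeled as an exchange of intermediate tiles between devices in the same grid row, so each device ends up with one row block of the intermediate tensor and never the whole of it.

```python
GRID = 2  # assumed 2 x 2 grid of simulated processing devices


def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def row_tiles(t, parts):
    """Split a matrix into `parts` equal blocks of rows."""
    step = len(t) // parts
    return [t[i * step:(i + 1) * step] for i in range(parts)]


def col_tiles(t, parts):
    """Split a matrix into `parts` equal blocks of columns."""
    step = len(t[0]) // parts
    return [[row[j * step:(j + 1) * step] for row in t] for j in range(parts)]


def hconcat(tiles):
    """Join tiles that share the same rows side by side."""
    return [sum(rows, []) for rows in zip(*tiles)]


def two_stage_tiled_matmul(first, second, third):
    first_t = row_tiles(first, GRID)
    second_t = col_tiles(second, GRID)
    third_t = col_tiles(third, GRID)

    # Load: device (i, j) holds row tile i of the first tensor and
    # column tile j of the second and third tensors (assumed layout).
    devices = {
        (i, j): {"first": first_t[i], "second": second_t[j],
                 "third": third_t[j], "inter": {}}
        for i in range(GRID) for j in range(GRID)
    }

    # First operation: device (i, j) produces intermediate tile (i, j).
    for (i, j), dev in devices.items():
        dev["inter"][j] = matmul(dev["first"], dev["second"])

    # Transfer: devices in the same grid row exchange their tiles.
    for (i, j), dev in devices.items():
        for peer in range(GRID):
            if peer != j:
                dev["inter"][peer] = devices[(i, peer)]["inter"][peer]

    # Second operation: each device multiplies its row block of the
    # intermediate tensor by its column tile of the third tensor.
    fourth_tiles = {}
    for (i, j), dev in devices.items():
        row_block = hconcat([dev["inter"][c] for c in range(GRID)])
        fourth_tiles[(i, j)] = matmul(row_block, dev["third"])

    # Assemble the fourth tensor here only so the result can be checked.
    fourth = []
    for i in range(GRID):
        fourth.extend(hconcat([fourth_tiles[(i, j)] for j in range(GRID)]))
    return fourth, devices


first = [[1, 2], [3, 4], [5, 6], [7, 8]]        # 4 x 2
second = [[1, 0, 2, 1], [0, 1, 1, 3]]           # 2 x 4
third = [[1, 2], [0, 1], [3, 0], [1, 1]]        # 4 x 2
fourth, devices = two_stage_tiled_matmul(first, second, third)
```

In this sketch the intermediate tensor consists of `GRID * GRID` tiles, and after the transfer each simulated device holds only `GRID` of them. A real multi-device implementation would replace the dictionary copies with an interconnect transfer, and could use a different tiling altogether.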