US 12,461,789 B2
Redistributing tensor elements between machine learning computing units
David Alexander Majnemer, Sunnyvale, CA (US); Ravi Narayanaswami, San Jose, CA (US); Dong Hyuk Woo, San Jose, CA (US); and Carrell Daniel Killebrew, Sunnyvale, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Appl. No. 17/629,437
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Oct. 7, 2020, PCT No. PCT/US2020/054554
§ 371(c)(1), (2) Date Jan. 24, 2022,
PCT Pub. No. WO2021/071930, PCT Pub. Date Apr. 15, 2021.
Claims priority of provisional application 62/911,678, filed on Oct. 7, 2019.
Prior Publication US 2022/0245453 A1, Aug. 4, 2022
Int. Cl. G06F 9/50 (2006.01); G06F 7/76 (2006.01); G06N 3/08 (2023.01)
CPC G06F 9/5066 (2013.01) [G06F 7/76 (2013.01); G06F 9/5016 (2013.01); G06N 3/08 (2013.01)]
20 Claims
1. A method, comprising:
distributing, by a controller, tensor elements of an N-dimensional tensor among a plurality of computing units of a computation system, wherein each computing unit performs computations using a subset of the tensor elements distributed to the computing unit;
receiving, by the controller, an instruction to redistribute the tensor elements of the N-dimensional tensor among the computing units;
in response to receiving the instruction, redistributing, by each computing unit, the subset of tensor elements previously distributed to the computing unit to one or more computing units of the computation system, including, for each particular computing unit of the computation system:
accessing, by the particular computing unit, redistribution partitioning data that specifies, for each computing unit, the tensor elements that are to be stored by the computing unit after redistributing the tensor elements;
for each tensor element previously distributed to the particular computing unit:
determining, by the particular computing unit, a global linearized index value for the tensor element based on a multi-dimensional index for the tensor element in the N-dimensional tensor, the multi-dimensional index for the tensor element including, for each dimension of the N-dimensional tensor, an index value that corresponds to a position of the tensor element along that dimension of the N-dimensional tensor;
determining, by the particular computing unit and using the redistribution partitioning data and the global linearized index value for the tensor element, a destination computing unit of the computation system to which the tensor element is to be redistributed; and
sending, by the particular computing unit, the tensor element to the destination computing unit.
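
The per-element steps recited in claim 1 (compute a global linearized index from the multi-dimensional index, look up the destination computing unit in the redistribution partitioning data, send the element) can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the claimed implementation: the claim does not fix a linearization order, so row-major order is assumed here; the redistribution partitioning data is assumed to be a sorted list of starting linear indices, one contiguous range per computing unit; and the names `linearize`, `destination_unit`, `redistribute`, `range_starts`, and `old_owner` are hypothetical. Sending is simulated by appending to a per-unit inbox list.

```python
from bisect import bisect_right
from itertools import product


def linearize(multi_index, shape):
    """Return a global linearized index for multi_index.

    Assumption: row-major (last dimension varies fastest) ordering.
    """
    linear = 0
    for idx, extent in zip(multi_index, shape):
        if not 0 <= idx < extent:
            raise IndexError(f"index {idx} out of range for extent {extent}")
        linear = linear * extent + idx
    return linear


def destination_unit(linear_index, range_starts):
    """Look up the destination unit for a linearized index.

    Assumption: range_starts[u] is the first linear index that unit u
    stores after redistribution; the list is sorted and starts at 0.
    """
    return bisect_right(range_starts, linear_index) - 1


def redistribute(shape, old_owner, range_starts, num_units):
    """Simulate every unit forwarding the elements it currently holds.

    old_owner(multi_index) is a hypothetical stand-in for the initial
    distribution performed by the controller.
    """
    inboxes = [[] for _ in range(num_units)]
    for unit in range(num_units):
        for multi_index in product(*(range(n) for n in shape)):
            if old_owner(multi_index) != unit:
                continue  # this unit does not hold the element
            linear = linearize(multi_index, shape)
            dest = destination_unit(linear, range_starts)
            inboxes[dest].append((linear, multi_index))  # simulated send
    return inboxes


# Example: a 2x3 tensor, initially split by column parity across two
# units, redistributed so that each unit holds one row.
shape = (2, 3)
inboxes = redistribute(shape, lambda mi: mi[1] % 2, [0, 3], num_units=2)
```

In this example, unit 0 ends up holding linear indices 0 through 2 (the first row) and unit 1 holds 3 through 5 (the second row). Each unit decides destinations locally from the partitioning data alone, which reflects the claim's requirement that the determining and sending steps are performed by the particular computing unit rather than by the controller.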