CPC G06N 3/08 (2013.01) [G06N 3/04 (2013.01)] | 20 Claims |
1. A computer-implemented method, comprising:
receiving a neural network model that includes a first tensor operation between a first tensor and a second tensor;
dividing the first tensor operation into a set of sub-operations, wherein each sub-operation of the set of sub-operations generates a portion of a final output of the first tensor operation, and wherein dividing the first tensor operation comprises one of:
assigning each sub-operation to a respective portion of the first tensor,
assigning each sub-operation to a respective portion of the second tensor, or
assigning each sub-operation to both a respective portion of the first tensor and a respective portion of the second tensor; and
generating instructions for performing individual sub-operations of the set of sub-operations on respective computing engines of a plurality of computing engines.
|
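The three partitioning options recited in claim 1 can be illustrated with a minimal sketch. It assumes the first tensor operation is a 2-D matrix multiplication, that the first tensor is split by rows and the second tensor by columns (so each sub-operation yields a disjoint portion of the final output), and that worker threads stand in for the computing engines. The helper names `split_rows`, `split_cols`, `divide`, and `run_on_engines` are hypothetical and do not come from the claim. The claim recites generating instructions for the engines, whereas this sketch simply executes the sub-operations directly.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Plain matrix product of two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def split_rows(m, parts):
    """Split a matrix into up to `parts` contiguous row blocks."""
    step = -(-len(m) // parts)  # ceiling division
    return [m[i:i + step] for i in range(0, len(m), step)]


def split_cols(m, parts):
    """Split a matrix into up to `parts` contiguous column blocks."""
    columns = [list(c) for c in zip(*m)]
    return [[list(r) for r in zip(*chunk)] for chunk in split_rows(columns, parts)]


def divide(a, b, mode, parts=2):
    """Divide a @ b into sub-operations (hypothetical helper).

    mode "first":  each sub-operation gets a portion of the first tensor.
    mode "second": each sub-operation gets a portion of the second tensor.
    mode "both":   each sub-operation gets a portion of both tensors.
    Returns tuples (a_part, b_part, row_block_index, col_block_index).
    """
    a_parts = split_rows(a, parts) if mode in ("first", "both") else [a]
    b_parts = split_cols(b, parts) if mode in ("second", "both") else [b]
    return [(ap, bp, i, j)
            for i, ap in enumerate(a_parts)
            for j, bp in enumerate(b_parts)]


def run_on_engines(sub_ops, num_engines=4):
    """Run each sub-operation on a worker thread (a stand-in for a computing
    engine) and stitch the output portions back into the final output."""
    with ThreadPoolExecutor(max_workers=num_engines) as pool:
        results = list(pool.map(
            lambda op: (op[2], op[3], matmul(op[0], op[1])), sub_ops))
    results.sort(key=lambda r: (r[0], r[1]))
    out = []
    for i in range(results[-1][0] + 1):
        # Column blocks of one row block, ordered left to right.
        blocks = [blk for ri, _, blk in results if ri == i]
        for rows in zip(*blocks):
            out.append([v for row in rows for v in row])
    return out


# Example tensors: a is 4x2, b is 2x4, so the final output is 4x4.
a = [[1, 2], [3, 4], [5, 6], [7, 8]]
b = [[1, 0, 2, 1], [0, 1, 1, 3]]
reference = matmul(a, b)
outputs = {mode: run_on_engines(divide(a, b, mode))
           for mode in ("first", "second", "both")}
```

In this sketch every mode reassembles to the same result as the undivided operation. Splitting along the shared inner dimension is deliberately avoided, since that would produce partial sums that need a reduction step, which is a different arrangement from each sub-operation generating its own portion of the final output.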