| CPC G06N 20/00 (2019.01) | 22 Claims |

|
1. An apparatus, comprising:
a memory configured to store tensors for Machine Learning (ML) processing; and
one or more processors, configured to:
receive a work plan associated with a subgraph of a ML graph of a ML model, wherein the work plan supports processing of tensors having respective shapes in a selected range of shapes, and wherein a shape of a tensor specifies respective sizes of dimensions of that tensor;
receive an input tensor that was stored in the memory, the input tensor having an actual shape, wherein the actual shape depends on a non-inferable tensor operation in a previously executed subgraph, wherein the non-inferable tensor operation generates a tensor whose shape depends on actual data in one or more tensors input to the non-inferable tensor operation;
based on the actual shape, modify the work plan to produce a modified work plan for processing the input tensor in accordance with the subgraph; and
process the input tensor in accordance with the subgraph, by submitting the modified work plan for execution by one or more of the processors;
wherein at least one of the one or more processors comprises a ML computational engine that is assigned to execute at least part of the modified work plan.
|
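The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. All names here (`WorkPlan`, `modify_work_plan`, `nonzero_rows`, the tile-based plan layout) are hypothetical and chosen only for illustration. The sketch assumes a work plan is a tiling prepared for a range of shapes, and that "modifying" it means recomputing the tile counts for the actual shape. Row filtering stands in for a non-inferable tensor operation, since the number of rows it outputs depends on the data values.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class WorkPlan:
    """Hypothetical work plan: a tiling valid for a range of shapes."""
    min_shape: Shape   # smallest supported size per dimension (inclusive)
    max_shape: Shape   # largest supported size per dimension (inclusive)
    tile: Shape        # tile size per dimension
    num_tiles: Shape   # tile count per dimension, initially sized for max_shape

    def supports(self, shape: Shape) -> bool:
        # A shape is supported if every dimension lies within the plan's range.
        return len(shape) == len(self.min_shape) and all(
            lo <= s <= hi
            for lo, s, hi in zip(self.min_shape, shape, self.max_shape)
        )


def nonzero_rows(data: List[List[int]]) -> List[List[int]]:
    # Stand-in for a non-inferable operation: the output row count
    # depends on the values in the input, not only on its shape.
    return [row for row in data if any(row)]


def shape_of(tensor: List[List[int]]) -> Shape:
    return (len(tensor), len(tensor[0]) if tensor else 0)


def modify_work_plan(plan: WorkPlan, actual_shape: Shape) -> WorkPlan:
    # Assumption: modification only recomputes tile counts (ceiling division)
    # for the actual shape; the tile sizes themselves are left unchanged.
    if not plan.supports(actual_shape):
        raise ValueError(f"shape {actual_shape} is outside the plan's range")
    num_tiles = tuple(-(-s // t) for s, t in zip(actual_shape, plan.tile))
    return replace(plan, num_tiles=num_tiles)


# A plan prepared ahead of time for 1 to 64 rows of 4 columns.
plan = WorkPlan(min_shape=(1, 4), max_shape=(64, 4), tile=(16, 4), num_tiles=(4, 1))

# Output of the previously executed subgraph; its shape is known only at run time.
filtered = nonzero_rows([[0, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0],
                         [0, 2, 0, 0], [3, 3, 3, 3]])

modified = modify_work_plan(plan, shape_of(filtered))
print(modified.num_tiles)
```

In this sketch the original plan is left untouched and a modified copy is returned, so the same range-based plan can be reused for the next input tensor whose shape falls in the selected range.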