| CPC G06N 3/10 (2013.01) [G06N 3/063 (2013.01)] | 20 Claims |

|
1. A system comprising:
a neural network accelerator comprising a plurality of processing cores and a memory;
a machine-readable storage medium storing instructions associated with a neural network application;
a compiler to compile the instructions and to generate machine-level executable codes executable by the plurality of processing cores, the machine-level executable codes to:
perform a plurality of tensor operations, each tensor operation of the plurality of tensor operations to generate an output tensor using a plurality of corresponding input operands;
select a binary tensor operation from the plurality of tensor operations that receives input operands from a first output tensor of a first tensor operation and a second output tensor of a second tensor operation;
generate, for the input operands of the binary tensor operation, a count of instances of the first output tensor and a count of instances of the second output tensor; and
allocate, in the memory of the neural network accelerator, a buffer space for a first input operand of the input operands in the binary tensor operation based on a difference between the count of instances of the first output tensor and the count of instances of the second output tensor.
|
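The select, count, and allocate steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The claim does not state how instance counts are obtained or how the difference maps to a size, so the names `TensorOp`, `select_binary_op`, and `buffer_bytes_for_first_operand`, the example counts, and the sizing rule (absolute difference times a per-instance byte size) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TensorOp:
    """Hypothetical record for one tensor operation in the compiled graph."""
    name: str
    inputs: Tuple[str, ...]  # names of the ops whose output tensors feed this op


def select_binary_op(ops: List[TensorOp]) -> Optional[TensorOp]:
    """Return the first op fed by the output tensors of two distinct other ops."""
    produced = {op.name for op in ops}
    for op in ops:
        if (
            len(op.inputs) == 2
            and op.inputs[0] != op.inputs[1]
            and all(src in produced for src in op.inputs)
        ):
            return op
    return None


def buffer_bytes_for_first_operand(
    count_first: int, count_second: int, bytes_per_instance: int
) -> int:
    """Assumed sizing rule: one buffered instance per unit of count difference."""
    return abs(count_first - count_second) * bytes_per_instance


# Two producers feeding one binary op; "x" is an external input, not an op.
ops = [
    TensorOp("conv_a", ("x",)),
    TensorOp("conv_b", ("x",)),
    TensorOp("add", ("conv_a", "conv_b")),
]

# Assumed instance counts for each producer's output tensor.
instance_counts = {"conv_a": 4, "conv_b": 1}

binary = select_binary_op(ops)
first, second = binary.inputs
size = buffer_bytes_for_first_operand(
    instance_counts[first], instance_counts[second], bytes_per_instance=1024
)
print(binary.name, size)  # add 3072
```

With equal counts this rule yields zero bytes. An actual allocator would presumably reserve some baseline space, which the claim language leaves open.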