US 12,254,416 B2
Compiler for implementing neural network accelerator
Jitendra Onkar Kolhe, Karnataka (IN); Soumitra Chatterjee, Karnataka (IN); Vaithyalingam Nagendran, Karnataka (IN); and Shounak Bandopadhyay, Karnataka (IN)
Assigned to Hewlett Packard Enterprise Development LP, Spring, TX (US)
Filed by Hewlett Packard Enterprise Development LP, Houston, TX (US)
Filed on Apr. 13, 2021, as Appl. No. 17/229,497.
Claims priority of application No. 202041045388 (IN), filed on Oct. 19, 2020.
Prior Publication US 2022/0121959 A1, Apr. 21, 2022
Int. Cl. G06F 9/44 (2018.01); G06N 3/063 (2023.01); G06N 3/10 (2006.01)
CPC G06N 3/10 (2013.01) [G06N 3/063 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A system comprising:
a neural network accelerator comprising a plurality of processing cores and a memory;
a machine-readable storage medium storing instructions associated with a neural network application;
a compiler to compile the instructions and to generate machine-level executable codes executable by the plurality of processing cores, the machine-level executable codes to:
perform a plurality of tensor operations, each tensor operation of the plurality of tensor operations to generate an output tensor using a plurality of corresponding input operands;
select a binary tensor operation from the plurality of tensor operations that receives input operands from a first output tensor of a first tensor operation and a second output tensor of a second tensor operation;
generate, for the input operands of the binary tensor operation, a count of instances of the first output tensor and a count of instances of the second output tensor; and
allocate, in the memory of the neural network accelerator, a buffer space for a first input operand of the input operands in the binary tensor operation based on a difference between the count of instances of the first output tensor and the count of instances of the second output tensor.
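The counting and allocation steps of claim 1 can be illustrated with a minimal sketch. This is not the patented implementation; it assumes (hypothetically) that the "count of instances" of an output tensor is the number of times that tensor appears as an input operand across the operation graph, and that the buffer space for the first operand grows with the positive difference between the two counts. The `TensorOp`, `instance_counts`, and `buffer_slots_for_first_operand` names are illustrative, not from the patent.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class TensorOp:
    name: str
    inputs: list   # names of input tensors consumed by this operation
    output: str    # name of the output tensor this operation produces

def instance_counts(ops):
    """Count how many times each tensor is used as an input operand."""
    counts = Counter()
    for op in ops:
        counts.update(op.inputs)
    return counts

def buffer_slots_for_first_operand(binary_op, ops, base_slots=1):
    """Size the buffer for the first input operand of a binary tensor
    operation from the difference between the instance counts of its two
    input tensors (a hedged reading of the claim: more outstanding uses
    of the first tensor relative to the second implies more buffering)."""
    counts = instance_counts(ops)
    first, second = binary_op.inputs
    diff = counts[first] - counts[second]
    return base_slots + max(diff, 0)

# Example graph: t1 = A(); t2 = B(); t3 = add(t1, t2); t4 = relu(t1)
ops = [
    TensorOp("A", [], "t1"),
    TensorOp("B", [], "t2"),
    TensorOp("add", ["t1", "t2"], "t3"),
    TensorOp("relu", ["t1"], "t4"),
]
binary = ops[2]  # the binary op: consumes two other ops' outputs
# t1 is consumed twice (by add and relu), t2 once, so one extra slot
print(buffer_slots_for_first_operand(binary, ops))  # prints 2
```

The example mirrors the claim's selection step: `add` is the binary tensor operation whose two input operands are output tensors of the first and second tensor operations, and the allocation for its first operand reflects the count difference.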