CPC G06F 8/433 (2013.01) [G06F 8/445 (2013.01)] | 20 Claims |
1. A computer-implemented method performed by a compiler, the computer-implemented method comprising:
receiving a description of a neural network model;
generating an intermediate representation of the neural network model representing a data flow graph;
traversing the data flow graph in reverse order from an output of the data flow graph towards an input of the data flow graph;
for each tensor in the data flow graph, adding a tensor live interval to a vector of intervals, wherein the tensor live interval indicates a last-use of a corresponding tensor and a first-definition of the corresponding tensor;
converting the vector of intervals into a binary tree of interval nodes using a median of the vector of intervals as a root of the binary tree of interval nodes;
for each interval node in the binary tree of interval nodes, determining an earliest-first-definition value for a sub-tree rooted at that interval node, and associating the earliest-first-definition value with that interval node;
for each tensor in the data flow graph, querying the binary tree of interval nodes for interferences of the tensor to generate an interference list of the tensor;
performing memory allocation for a buffer memory of an accelerator based on the interference list of the tensor; and
generating machine code based on the memory allocation, wherein the machine code is executed on the accelerator to implement the neural network model.
|
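The interval-collection, tree-construction, and interference-query steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. All names (`Node`, `live_intervals`, `build`, `query`, `interference_lists`) are hypothetical, and the sketch assumes that the data flow graph is given as a list of `(outputs, inputs)` tensor-name pairs in execution order, that the vector of intervals is sorted by last-use before the median is taken, and that two tensors interfere when their closed live intervals overlap. Memory allocation and code generation are not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    first_def: int
    last_use: int
    earliest_def: int  # smallest first_def anywhere in this node's sub-tree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def live_intervals(ops):
    """Walk the ops from output to input, recording (name, first_def, last_use)."""
    last_use, intervals = {}, []
    for pos in range(len(ops) - 1, -1, -1):
        outputs, inputs = ops[pos]
        for t in inputs:
            # The first sighting in reverse order is the tensor's last use.
            last_use.setdefault(t, pos)
        for t in outputs:
            # A tensor that is never read is live only at its definition.
            intervals.append((t, pos, last_use.get(t, pos)))
    return intervals


def build(items):
    """Build a balanced tree from items sorted by last_use (assumed ordering)."""
    if not items:
        return None
    mid = len(items) // 2  # the median element becomes the root
    name, d, u = items[mid]
    node = Node(name, d, u, d, build(items[:mid]), build(items[mid + 1:]))
    for child in (node.left, node.right):
        if child is not None:
            node.earliest_def = min(node.earliest_def, child.earliest_def)
    return node


def query(node, d, u, out):
    """Append to out every interval in the sub-tree that overlaps [d, u]."""
    if node is None or node.earliest_def > u:
        return  # nothing in this sub-tree is defined early enough to overlap
    if node.last_use >= d:
        # Only then can the node or its left sub-tree still be live at d.
        query(node.left, d, u, out)
        if node.first_def <= u:
            out.append(node.name)
    query(node.right, d, u, out)


def interference_lists(ops):
    """Return {tensor: sorted names of tensors whose live intervals overlap it}."""
    intervals = live_intervals(ops)
    root = build(sorted(intervals, key=lambda iv: iv[2]))
    result = {}
    for name, d, u in intervals:
        hits = []
        query(root, d, u, hits)
        result[name] = sorted(h for h in hits if h != name)
    return result


# Hypothetical four-op graph: a -> b -> c, then d reads both a and c.
EXAMPLE_OPS = [(["a"], []), (["b"], ["a"]), (["c"], ["b"]), (["d"], ["a", "c"])]
```

Calling `interference_lists(EXAMPLE_OPS)` reports that `b` interferes with `a` and `c` but not with `d`, so under these assumptions `b` and `d` could share the same region of buffer memory. The stored earliest-first-definition value lets the query skip a whole sub-tree whenever no tensor in it is defined at or before the queried last-use.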