| CPC G06F 8/443 (2013.01) [G06N 3/10 (2013.01)] | 20 Claims |

|
1. A method of compiling a deep learning model, comprising:
reading metadata from a compiled result, the metadata being extracted from a low-level intermediate representation (IR) generated from a deep learning compilation process for compiling the deep learning model, the metadata indicating a structure of the deep learning model corresponding to the low-level IR, the structure including computation operations and connections connecting the computation operations;
receiving shape information of an input tensor of the deep learning model;
determining a shape of an output tensor of a first computation operation of the computation operations based on the shape information of the input tensor of the deep learning model and the structure of the deep learning model;
tiling the output tensor of the first computation operation into one or more tiles according to the shape of the output tensor of the first computation operation and hardware limitations of a processor executing the deep learning model; and
patching one or more copies of a templated hardware command into executable hardware commands, the one or more copies of the templated hardware command corresponding to the one or more tiles, respectively, the templated hardware command being part of the metadata and corresponding to the first computation operation.
|
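Illustrative sketch (not part of the claim): the steps recited in claim 1 can be pictured with a small Python example. Everything named here is a hypothetical stand-in chosen for illustration: the `metadata` layout, the `CommandTemplate` fields, the `MAX_TILE` hardware limit, and the choice of a 2-D matrix multiply as the first computation operation. The claim does not specify any of these details.

```python
from dataclasses import dataclass, replace

# Assumption: the processor can produce at most 64 x 128 output elements per command.
MAX_TILE = (64, 128)


@dataclass(frozen=True)
class CommandTemplate:
    """Hypothetical templated hardware command; -1 marks an unpatched field."""
    opcode: str
    row_start: int = -1
    col_start: int = -1
    rows: int = -1
    cols: int = -1


# Assumption: metadata read from the compiled result lists each operation,
# its static parameters, and its command template.
metadata = {
    "ops": [
        {
            "name": "fc1",
            "type": "matmul",
            "weight_shape": (256, 200),
            "template": CommandTemplate(opcode="MATMUL"),
        }
    ]
}


def matmul_output_shape(input_shape, weight_shape):
    """Infer the output shape of (M, K) @ (K, N), which is (M, N)."""
    m, k = input_shape
    k2, n = weight_shape
    if k != k2:
        raise ValueError(f"inner dimensions differ: {k} vs {k2}")
    return (m, n)


def tile_output(shape, max_tile=MAX_TILE):
    """Split a 2-D output into (row, col, height, width) tiles within max_tile."""
    tiles = []
    for r in range(0, shape[0], max_tile[0]):
        for c in range(0, shape[1], max_tile[1]):
            height = min(max_tile[0], shape[0] - r)
            width = min(max_tile[1], shape[1] - c)
            tiles.append((r, c, height, width))
    return tiles


def patch_commands(template, tiles):
    """Copy the template once per tile and fill in the tile-specific fields."""
    return [
        replace(template, row_start=r, col_start=c, rows=h, cols=w)
        for (r, c, h, w) in tiles
    ]


def compile_for_shape(metadata, input_shape):
    """Run shape inference, tiling, and patching for the first operation."""
    op = metadata["ops"][0]
    out_shape = matmul_output_shape(input_shape, op["weight_shape"])
    tiles = tile_output(out_shape)
    return out_shape, patch_commands(op["template"], tiles)


out_shape, commands = compile_for_shape(metadata, input_shape=(100, 256))
```

In this sketch the template is patched only after the input shape is known, so the same metadata can be reused for a different input shape without rerunning the full compilation that produced the low-level IR.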