US 12,223,300 B2
Deep learning model inference for dynamic input shapes
Meng-Hsuan Yang, Hsinchu (TW); Po-hua Huang, Hsinchu (TW); Hsing-Chang Chou, Hsinchu (TW); Ting Chen Tsan, Hsinchu (TW); and Yu-Lung Lu, Hsinchu (TW)
Assigned to MEDIATEK INC., Hsinchu (TW)
Filed by MEDIATEK INC., Hsinchu (TW)
Filed on Apr. 28, 2023, as Appl. No. 18/309,341.
Prior Publication US 2024/0361999 A1, Oct. 31, 2024
Int. Cl. G06F 8/41 (2018.01); G06N 3/10 (2006.01)
CPC G06F 8/443 (2013.01) [G06N 3/10 (2013.01)]
20 Claims
 
1. A method of compiling a deep learning model, comprising:
reading metadata from a compiled result, the metadata being extracted from a low-level intermediate representation (IR) generated from a deep learning compilation process for compiling the deep learning model, the metadata indicating a structure of the deep learning model corresponding to the low-level IR, the structure including computation operations and connections connecting the computation operations;
receiving shape information of an input tensor of the deep learning model;
determining a shape of an output tensor of a first computation operation of the computation operations based on the shape information of the input tensor of the deep learning model and the structure of the deep learning model;
tiling the output tensor of the first computation operation into one or more tiles according to the shape of the output tensor of the first computation operation and hardware limitations of a processor executing the deep learning model; and
patching one or more copies of a templated hardware command into executable hardware commands, the one or more copies of the templated hardware command corresponding to the one or more tiles, respectively, the templated hardware command being part of the metadata and corresponding to the first computation operation.
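
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. Everything named here is a hypothetical choice for illustration: the `Op` record, the command fields `row_start`, `rows`, and `cols`, the unpadded sliding-window shape rule, and the modeling of the hardware limitation as a maximum number of output rows per tile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One computation operation as it might appear in the metadata (hypothetical layout)."""
    name: str
    kernel: int      # square window size
    stride: int
    template: dict   # templated hardware command with unfilled fields


def infer_output_shape(in_shape, op):
    """Output (rows, cols) of an unpadded sliding-window op for a given input shape."""
    h, w = in_shape
    return ((h - op.kernel) // op.stride + 1,
            (w - op.kernel) // op.stride + 1)


def tile_output(out_shape, max_tile_rows):
    """Split the output into row bands of at most max_tile_rows rows each.

    max_tile_rows stands in for a hardware limitation (assumption).
    Returns a list of (row_start, rows, cols) tuples.
    """
    h, w = out_shape
    return [(r, min(max_tile_rows, h - r), w)
            for r in range(0, h, max_tile_rows)]


def patch_commands(template, tiles):
    """Make one copy of the templated command per tile and fill in its fields."""
    return [{**template, "row_start": r, "rows": n, "cols": w}
            for r, n, w in tiles]


def specialize(metadata, input_shape, max_tile_rows):
    """Read metadata, infer the shape, tile, and patch for the first operation."""
    op = metadata["ops"][0]
    out_shape = infer_output_shape(input_shape, op)
    tiles = tile_output(out_shape, max_tile_rows)
    return patch_commands(op.template, tiles)


if __name__ == "__main__":
    conv = Op(name="conv0", kernel=3, stride=1,
              template={"opcode": "CONV", "row_start": None,
                        "rows": None, "cols": None})
    metadata = {"ops": [conv]}
    # The input shape is known only at run time; no recompilation is needed.
    for cmd in specialize(metadata, input_shape=(10, 12), max_tile_rows=3):
        print(cmd)
```

With a 10 x 12 input, the sketch infers an 8 x 10 output and emits three patched commands covering 3, 3, and 2 rows. A different input shape reuses the same metadata and template, and only the tiling and patching steps are repeated.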