US 12,001,936 B2
Lossless tiling in convolution networks—graph metadata generation
Tejas Nagendra Babu Nama, Sunnyvale, CA (US); Ruddhi Chaphekar, Santa Clara, CA (US); Ram Sivaramakrishnan, San Jose, CA (US); Raghu Prabhakar, San Jose, CA (US); Sumti Jairath, Santa Clara, CA (US); Junjue Wang, San Mateo, CA (US); Kaizhao Liang, Palo Alto, CA (US); Adi Fuchs, West Windsor, NJ (US); Matheen Musaddiq, Austin, TX (US); and Arvind Krishna Sujeeth, San Francisco, CA (US)
Assigned to SambaNova Systems, Inc., Palo Alto, CA (US)
Filed by SambaNova Systems, Inc., Palo Alto, CA (US)
Filed on Mar. 21, 2022, as Appl. No. 17/700,452.
Application 17/364,110 is a division of application No. 17/216,651, filed on Mar. 29, 2021, granted, now 11,195,080, issued on Dec. 7, 2021.
Application 17/700,452 is a continuation of application No. 17/364,110, filed on Jun. 30, 2021.
Prior Publication US 2022/0309324 A1, Sep. 29, 2022
Int. Cl. G06N 3/04 (2023.01)
CPC G06N 3/04 (2013.01)
22 Claims
1. One or more non-transitory computer readable storage media storing computer program instructions, that if executed on a processor, implement a method comprising:
obtaining a processing graph of an application, the processing graph having a sequence of processing nodes, the sequence of processing nodes including an input processing node followed by an intermediate processing node and an output processing node, the input processing node configured to process an input and generate an intermediate representation of the input, the intermediate processing node configured to process the intermediate representation and generate a further intermediate representation of the input, and the output processing node configured to process the further intermediate representation and generate an output representation of the input;
generating graph metadata that specifies a target tiling configuration for the output representation to tile the output representation into a set of non-overlapping tiles, a first tiling configuration for the input to tile the input into a first set of overlapping tiles, a second tiling configuration for the intermediate representation to tile the intermediate representation into a second set of overlapping tiles, and a third tiling configuration for the further intermediate representation to tile the further intermediate representation into a third set of tiles;
modifying the processing graph based on the graph metadata to generate a modified processing graph, wherein the modified processing graph is configured to generate (a) the first set of overlapping tiles in the first tiling configuration, (b) the second set of overlapping tiles in the second tiling configuration by using the first set of overlapping tiles as a first set of tile-by-tile effective receptive fields for the input processing node, (c) the third set of tiles in the third tiling configuration by using the second set of overlapping tiles as a second set of tile-by-tile effective receptive fields for the intermediate processing node, and (d) the set of non-overlapping tiles in the target tiling configuration by using the third set of tiles as a third set of tile-by-tile effective receptive fields for the output processing node; and
creating a set of computer instructions to execute the modified processing graph on a target processing system.
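
The metadata generation recited in claim 1 can be illustrated with a minimal one-dimensional sketch. It is not the patented implementation. It assumes each processing node is an unpadded ("valid") convolution described only by a kernel size and a stride, and it uses a box filter as a stand-in for real weights. The names `Node`, `receptive_field`, `generate_graph_metadata`, `conv1d`, and `run_tiled` are hypothetical and do not come from the patent. Starting from a non-overlapping target tiling of the output, the sketch walks the node sequence backwards and derives, for each earlier representation, the tile intervals that serve as tile-by-tile effective receptive fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical 1-D unpadded convolution node (assumption, not from the patent)."""
    kernel: int
    stride: int


def output_length(node, n):
    """Number of outputs an unpadded convolution yields for an input of length n."""
    return (n - node.kernel) // node.stride + 1


def receptive_field(node, start, end):
    """Input interval [lo, hi) needed to produce the output interval [start, end)."""
    return start * node.stride, (end - 1) * node.stride + node.kernel


def generate_graph_metadata(nodes, input_len, num_tiles):
    """Return one tiling per representation: index 0 is the input, -1 the output.

    Each tiling is a list of half-open (start, end) intervals.
    Assumes the output is at least num_tiles elements long.
    """
    n = input_len
    for node in nodes:
        n = output_length(node, n)
    # Target tiling configuration: split the output into non-overlapping tiles.
    bounds = [n * i // num_tiles for i in range(num_tiles + 1)]
    tiling = [(bounds[i], bounds[i + 1]) for i in range(num_tiles)]
    metadata = [tiling]
    # Walk backwards: each earlier tiling is the receptive field of the later one.
    for node in reversed(nodes):
        tiling = [receptive_field(node, s, e) for s, e in tiling]
        metadata.append(tiling)
    metadata.reverse()
    return metadata


def conv1d(node, xs):
    """Box-filter convolution (sum over each window), standing in for real weights."""
    return [
        sum(xs[i * node.stride : i * node.stride + node.kernel])
        for i in range(output_length(node, len(xs)))
    ]


def run_tiled(nodes, xs, metadata):
    """Process each input tile independently and concatenate the output tiles."""
    out = []
    for s, e in metadata[0]:
        tile = xs[s:e]
        for node in nodes:
            tile = conv1d(node, tile)
        out.extend(tile)
    return out


# Input node, intermediate node, output node, as in the claimed sequence.
nodes = [Node(kernel=3, stride=1), Node(kernel=3, stride=2), Node(kernel=3, stride=1)]
xs = list(range(32))
metadata = generate_graph_metadata(nodes, input_len=len(xs), num_tiles=2)

# Reference result: run the untiled graph on the whole input.
full = xs
for node in nodes:
    full = conv1d(node, full)
tiled = run_tiled(nodes, xs, metadata)
```

Under these assumptions the input tiles overlap while the output tiles do not, and `tiled` equals `full`, which is the sense in which such a tiling is lossless. A real compiler would also have to handle padding, two-dimensional tiles, and the placement of tiles in memory, none of which is modeled here.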