US 11,748,622 B1
Saving intermediate outputs of a neural network
Drazen Borkovic, Los Altos, CA (US); and Se jong Oh, Sammamish, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 4, 2019, as Appl. No. 16/292,236.
Int. Cl. G06N 3/082 (2023.01); G06F 16/901 (2019.01); G06N 3/063 (2023.01); G06F 8/41 (2018.01)
CPC G06N 3/082 (2013.01) [G06F 8/433 (2013.01); G06F 16/9024 (2019.01); G06N 3/063 (2013.01)]
20 Claims
 
1. A computer-implemented method, comprising:
receiving, by a compiler executing on a computing device, a request for access to an intermediate output of a neural network, the request identifying a particular layer of the neural network that produces the intermediate output;
retrieving, by the compiler, a data flow graph for the particular layer, the data flow graph comprising a plurality of nodes, each node representing an operation of the neural network to be executed by an integrated circuit device, wherein the nodes are interconnected by connections indicating a sequence in which the operations represented by the nodes are to be executed;
determining, by the compiler, that the intermediate output corresponds to an output of a first node of the plurality of nodes;
determining, by the compiler, a location to insert an additional node into the data flow graph, wherein the additional node represents a save operation that saves the output of the first node;
augmenting, by the compiler, the data flow graph, the augmenting comprising, while maintaining existing nodes and connections in the data flow graph:
inserting the additional node at the determined location;
creating a first connection from the first node to the additional node;
creating a second connection from the additional node to a second node of the plurality of nodes based on a result of a dependency check, the second connection representing a data dependency between the save operation and a second operation represented by the second node;
creating a third connection from the additional node to a third node of the plurality of nodes based on the result of the dependency check, the third connection representing a data dependency between the save operation and a third operation represented by the third node; and
removing the second connection from the augmented data flow graph based on determining that the second connection is redundant; and
converting, by the compiler, the augmented data flow graph into machine instructions executable by the integrated circuit device.
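
The augmenting steps recited above can be illustrated with a minimal Python sketch. It is not the patented implementation. The class DataFlowGraph, the function insert_save_node, and the node names (load, matmul, act, pool, save_matmul) are hypothetical. The sketch assumes the dependency check has already returned the nodes that must wait for the save operation, and it assumes a connection is redundant when its target is still reachable from the save node through another path. The final conversion to machine instructions is not modeled.

```python
from collections import defaultdict


class DataFlowGraph:
    """Directed graph: each node is an operation, each edge an ordering constraint."""

    def __init__(self):
        self.successors = defaultdict(set)

    def add_edge(self, src, dst):
        self.successors[src].add(dst)
        self.successors[dst]  # touch the key so dst is registered as a node

    def has_indirect_path(self, src, dst):
        """True if dst is reachable from src without using the direct edge src -> dst."""
        stack = [n for n in self.successors[src] if n != dst]
        seen = set(stack)
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            for nxt in self.successors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False


def insert_save_node(graph, first_node, dependents, save_node):
    """Insert save_node after first_node; existing nodes and edges are left untouched.

    dependents stands in for the result of the dependency check (an assumption
    of this sketch): the nodes whose operations must wait for the save.
    Returns the list of save-node connections removed as redundant.
    """
    # First connection: the save operation consumes the output of first_node.
    graph.add_edge(first_node, save_node)
    # One connection per dependent operation reported by the dependency check.
    for dep in dependents:
        graph.add_edge(save_node, dep)
    # Drop any new connection that another path from save_node already implies.
    removed = []
    for dep in sorted(dependents):
        if graph.has_indirect_path(save_node, dep):
            graph.successors[save_node].discard(dep)
            removed.append((save_node, dep))
    return removed


# Hypothetical layer: load -> matmul -> act -> pool
g = DataFlowGraph()
g.add_edge("load", "matmul")
g.add_edge("matmul", "act")
g.add_edge("act", "pool")

# Save the output of "matmul"; assume the dependency check flagged "pool" and "act".
removed = insert_save_node(g, "matmul", ["pool", "act"], "save_matmul")
```

In this example the connection from save_matmul to pool plays the role of the second connection: it is removed because the path save_matmul, act, pool already enforces the same ordering, while the connection to act (the third connection) is kept.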