US 11,748,622 B1
Saving intermediate outputs of a neural network
Drazen Borkovic, Los Altos, CA (US); and Se jong Oh, Sammamish, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 4, 2019, as Appl. No. 16/292,236.
Int. Cl. G06N 3/082 (2023.01); G06F 16/901 (2019.01); G06N 3/063 (2023.01); G06F 8/41 (2018.01)
CPC G06N 3/082 (2013.01) [G06F 8/433 (2013.01); G06F 16/9024 (2019.01); G06N 3/063 (2013.01)]
20 Claims
1. A computer-implemented method, comprising:
receiving, by a compiler executing on a computing device, a request for access to an intermediate output of a neural network, the request identifying a particular layer of the neural network that produces the intermediate output;
retrieving, by the compiler, a data flow graph for the particular layer, the data flow graph comprising a plurality of nodes, each node representing an operation of the neural network to be executed by an integrated circuit device, wherein the nodes are interconnected by connections indicating a sequence in which the operations represented by the nodes are to be executed;
determining, by the compiler, that the intermediate output corresponds to an output of a first node of the plurality of nodes;
determining, by the compiler, a location to insert an additional node into the data flow graph, wherein the additional node represents a save operation that saves the output of the first node;
augmenting, by the compiler, the data flow graph, the augmenting comprising, while maintaining existing nodes and connections in the data flow graph:
inserting the additional node at the determined location;
creating a first connection from the first node to the additional node;
creating a second connection from the additional node to a second node of the plurality of nodes based on a result of a dependency check, the second connection representing a data dependency between the save operation and a second operation represented by the second node;
creating a third connection from the additional node to a third node of the plurality of nodes based on the result of the dependency check, the third connection representing a data dependency between the save operation and a third operation represented by the third node; and
removing the second connection from the augmented data flow graph based on the determining that the second connection is redundant; and
converting, by the compiler, the augmented data flow graph into machine instructions executable by the integrated circuit device.
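
The augmenting and pruning steps recited above can be illustrated with a minimal sketch. It is not the patented compiler's implementation: the `DataFlowGraph` class, the `insert_save_node` function, the operation names, and the rule used here for "redundant" (a connection is dropped when its target is still reachable from the save node along another path) are all assumptions made for illustration. The result of the dependency check is likewise assumed to be supplied as a plain list of dependent nodes.

```python
class DataFlowGraph:
    """Hypothetical minimal DAG: nodes are operation names, edges are connections."""

    def __init__(self):
        self.succ = {}  # node -> set of successor nodes

    def add_node(self, node):
        self.succ.setdefault(node, set())

    def add_edge(self, src, dst):
        self.add_node(src)
        self.add_node(dst)
        self.succ[src].add(dst)

    def remove_edge(self, src, dst):
        self.succ[src].discard(dst)

    def edges(self):
        return {(s, d) for s, ds in self.succ.items() for d in ds}

    def reachable(self, src, dst, skip_edge=None):
        """Depth-first search from src to dst, ignoring one edge if given."""
        stack, seen = [src], {src}
        while stack:
            node = stack.pop()
            for nxt in self.succ[node]:
                if (node, nxt) == skip_edge:
                    continue
                if nxt == dst:
                    return True
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return False


def insert_save_node(graph, producer, dependents, save_name="save"):
    """Add a save node after `producer` without touching existing edges.

    `dependents` stands in for the dependency-check result (an assumption):
    operations that must not run before the save has completed.
    """
    graph.add_node(save_name)
    graph.add_edge(producer, save_name)      # first connection
    for dep in dependents:
        graph.add_edge(save_name, dep)       # second, third, ... connections
    # Drop any new connection that another path from the save node already implies.
    for dep in list(dependents):
        if graph.reachable(save_name, dep, skip_edge=(save_name, dep)):
            graph.remove_edge(save_name, dep)


# Existing graph: matmul -> act -> pool (hypothetical operation names).
g = DataFlowGraph()
g.add_edge("matmul", "act")
g.add_edge("act", "pool")

# Assume the dependency check reports that both "pool" and "act" depend on the save.
insert_save_node(g, "matmul", ["pool", "act"])
print(sorted(g.edges()))
# [('act', 'pool'), ('matmul', 'act'), ('matmul', 'save'), ('save', 'act')]
```

In this example the connection from the save node to `pool` plays the role of the second connection: it is removed because the path save, `act`, `pool` already enforces the same ordering, while the original `matmul` to `act` and `act` to `pool` connections are left intact.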