CPC G06F 9/3877 (2013.01) [G06F 15/80 (2013.01); G06N 3/063 (2013.01); G06N 5/046 (2013.01)] | 16 Claims
1. A neural network inference circuit for executing a neural network that comprises a plurality of computation nodes at a plurality of layers, the neural network inference circuit comprising:
a plurality of core circuits comprising memories for storing input values for the computation nodes of the neural network;
a set of post-processing circuits for computing output values of the computation nodes of the neural network, wherein output values of computation nodes of a first layer of the neural network are for storage in the memories of the plurality of core circuits as input values for a second layer of the neural network; and
an output bus, comprising a plurality of lanes, that connects the set of post-processing circuits to the plurality of core circuits, the output bus for (i) receiving a set of output values from the set of post-processing circuits, (ii) transporting the set of output values to the plurality of core circuits based on configuration data specifying a core circuit at which each output value of the set of output values is to be stored, and (iii) aligning the set of output values for storage in the plurality of core circuits,
wherein:
the plurality of lanes of the output bus are ordered and indexed, each lane corresponding to a set of post-processing units;
for a particular clock cycle, each lane receives at most one computed output value from one of the post-processing units of its corresponding set of post-processing units;
a subset of the output values that is transported, in the particular clock cycle, to a particular core circuit that receives output values in the particular clock cycle is transported on contiguous lanes of the output bus; and
the output bus aligns the subset of output values transported to the particular core circuit by shifting the subset of output values by an amount based on a lowest index of the contiguous lanes.
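The alignment mechanism recited above can be illustrated with a minimal behavioral sketch. This is a hypothetical software model, not the claimed circuit: the lane count, the `None` convention for idle lanes, and the function name `align_for_core` are all assumptions introduced for illustration. It models one clock cycle in which each indexed lane carries at most one output value, the values destined for one core circuit occupy contiguous lanes, and the bus shifts that subset down by the lowest occupied lane index so the values land at the start of the core's storage word.

```python
# Hypothetical model of the claimed output-bus alignment (one clock cycle).
# Lanes are ordered and indexed; an idle lane is modeled as None.

def align_for_core(lane_values, core_lane_indices):
    """Shift the contiguous subset of lane values destined for one core
    circuit down to offset 0, by the lowest index of those lanes."""
    lowest = min(core_lane_indices)  # lowest index of the contiguous lanes
    shifted = [None] * len(lane_values)
    for lane in core_lane_indices:
        # Shift amount is `lowest`, per the claim's alignment rule.
        shifted[lane - lowest] = lane_values[lane]
    return shifted

# Example cycle: 8 lanes; lanes 3-5 carry values for a particular core.
lanes = [None, None, None, 0x1A, 0x2B, 0x3C, None, None]
aligned = align_for_core(lanes, [3, 4, 5])
# aligned == [0x1A, 0x2B, 0x3C, None, None, None, None, None]
```

In hardware this shift would typically be a barrel shifter or multiplexer network driven by the configuration data; the list manipulation here only mirrors the input/output behavior.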