CPC G06N 3/063 (2013.01) [G06F 15/8046 (2013.01); G06N 3/045 (2023.01); G06N 3/08 (2013.01); G06N 5/04 (2013.01)] | 20 Claims |
1. A method for computing neural network inferences using a special-purpose processor comprising a hardware integrated circuit configured to implement the neural network, the method comprising:
receiving, at the special-purpose processor:
i) a kernel weight matrix comprising weights for a layer of the neural network and
ii) activations provided as inputs to the layer;
loading the kernel weight matrix into a matrix unit of the special-purpose processor by shifting the weights for the layer along a first dimension of the matrix unit;
loading the activations into the matrix unit by shifting activations representing inputs to the layer along a second dimension of the matrix unit;
computing, at the matrix unit, accumulated values from convolutions executed on the weights and activations shifted along dimensions of the matrix unit; and
generating an output for the layer based on the accumulated values.
|
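The data movement recited in claim 1 can be illustrated with a small software model. The following is a minimal sketch only, not the claimed hardware: it assumes a weight-stationary grid of multiply-accumulate cells, uses a plain matrix product as a stand-in for the convolution (a convolution can be lowered to this form), and the names `load_weights` and `systolic_matmul` are hypothetical helpers introduced here for illustration.

```python
def load_weights(weights):
    """Shift weight rows into the grid along the first (row) dimension.

    Assumption: one row enters at the top per step and every stored row
    moves down one cell, so the last row of `weights` is pushed in first.
    """
    rows, cols = len(weights), len(weights[0])
    grid = [[0] * cols for _ in range(rows)]
    for row in reversed(weights):
        grid = [list(row)] + grid[:-1]
    return grid


def systolic_matmul(activations, weights):
    """Cycle-by-cycle model: activations (M x K) times weights (K x N)."""
    grid = load_weights(weights)
    K, N = len(grid), len(grid[0])
    M = len(activations)
    act = [[0] * N for _ in range(K)]   # activation register of each cell
    psum = [[0] * N for _ in range(K)]  # partial-sum register of each cell
    out = [[0] * N for _ in range(M)]

    for t in range(M + K + N - 2):
        new_act = [[0] * N for _ in range(K)]
        new_psum = [[0] * N for _ in range(K)]
        for k in range(K):
            for n in range(N):
                if n == 0:
                    # Row k is fed from the left edge, skewed by k cycles.
                    m = t - k
                    a = activations[m][k] if 0 <= m < M else 0
                else:
                    # Activations shift along the second (column) dimension.
                    a = act[k][n - 1]
                above = psum[k - 1][n] if k > 0 else 0
                new_act[k][n] = a
                # Accumulate: partial sum from the cell above plus product.
                new_psum[k][n] = above + a * grid[k][n]
        act, psum = new_act, new_psum

        # The bottom row emits a finished accumulated value for input m.
        for n in range(N):
            m = t - (K - 1) - n
            if 0 <= m < M:
                out[m][n] = psum[K - 1][n]
    return out


if __name__ == "__main__":
    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    # prints [[19, 22], [43, 50]]
```

In this model the weights stay fixed after loading while activations stream across the columns and partial sums flow down them, so each accumulated value leaves the bottom row once it has passed every cell in its column. Other dataflows (for example output-stationary) would also fit the general language of the claim; the choice here is only for brevity.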