| CPC G06N 3/04 (2013.01) [G06F 8/41 (2013.01); G06F 17/15 (2013.01); G06N 3/063 (2013.01)] | 20 Claims |

|
1. A method, comprising:
partitioning an input tensor of a convolution of a neural network into tiles;
partitioning a tile into a plurality of micro-tiles;
performing an iteration of the convolution on the tile, wherein performing the iteration of the convolution comprises:
performing, by a group of convolution engines, multiply-accumulate (MAC) operations on the plurality of micro-tiles of the tile in parallel, different micro-tiles processed by different convolution engines in the group; and
generating an output tensor of the convolution based on the iteration of the convolution.
|
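The steps recited in claim 1 can be illustrated with a minimal software sketch. It is not the claimed hardware or the applicant's implementation. It assumes a single-channel 2D input, a "valid" convolution with stride 1, and tiles indexed by output coordinates, so each micro-tile reads its input window plus a halo of kernel size minus one. The names `split`, `engine_mac`, `tiled_conv2d`, and the parameters `tile`, `micro`, and `engines` are hypothetical, and a thread pool stands in for the group of convolution engines.

```python
from concurrent.futures import ThreadPoolExecutor


def conv2d_reference(x, k):
    """Direct 'valid' 2D convolution (no kernel flip), used only as a check."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + a][j + b] * k[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)] for i in range(oh)]


def split(start, stop, size):
    """Yield (lo, hi) ranges of at most `size` elements covering [start, stop)."""
    for lo in range(start, stop, size):
        yield lo, min(lo + size, stop)


def engine_mac(x, k, rows, cols):
    """Hypothetical convolution engine: MAC operations for one micro-tile."""
    kh, kw = len(k), len(k[0])
    out = {}
    for i in range(*rows):
        for j in range(*cols):
            acc = 0
            for a in range(kh):
                for b in range(kw):
                    acc += x[i + a][j + b] * k[a][b]  # multiply-accumulate
            out[(i, j)] = acc
    return out


def tiled_conv2d(x, k, tile=4, micro=2, engines=4):
    """Tile the work, split each tile into micro-tiles, dispatch to engines."""
    kh, kw = len(k), len(k[0])
    oh, ow = len(x) - kh + 1, len(x[0]) - kw + 1
    y = [[0] * ow for _ in range(oh)]
    with ThreadPoolExecutor(max_workers=engines) as pool:
        for tile_rows in split(0, oh, tile):
            for tile_cols in split(0, ow, tile):
                # One iteration of the convolution: process a single tile.
                micro_tiles = [(mr, mc)
                               for mr in split(*tile_rows, micro)
                               for mc in split(*tile_cols, micro)]
                # Each micro-tile is submitted as a separate engine task.
                futures = [pool.submit(engine_mac, x, k, mr, mc)
                           for mr, mc in micro_tiles]
                # Collect the partial results into the output tensor.
                for fut in futures:
                    for (i, j), value in fut.result().items():
                        y[i][j] = value
    return y


if __name__ == "__main__":
    x = [[r * 7 + c for c in range(7)] for r in range(6)]
    k = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    print(tiled_conv2d(x, k) == conv2d_reference(x, k))
```

With the default `tile=4` and `micro=2`, a full tile yields four micro-tiles, one per simulated engine. Python threads only model the dispatch pattern here; they do not reproduce the parallel throughput of dedicated hardware engines.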