CPC G06N 3/04 (2013.01) [G06F 17/16 (2013.01)] | 18 Claims |
1. A method of processing layers in a neural network, the method comprising:
obtaining a plurality of Input Feature Map (IFM) tiles of at least one IFM tensor and a plurality of kernel tiles of at least one kernel tensor from a memory;
performing, by an accelerator, a convolutional operation on the plurality of IFM tiles and the plurality of kernel tiles based on IFM sparsity and kernel sparsity;
generating, by the accelerator, a plurality of partial Output Feature Map (OFM) tiles; and
generating, by the accelerator, a plurality of OFM tiles corresponding to the plurality of IFM tiles using the plurality of partial OFM tiles.
|
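The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed accelerator, only one possible software reading of the claim under stated assumptions: tiles are formed along the input-channel dimension, "sparsity" is treated at tile granularity (a tile pair is skipped when either tile is entirely zero), there is a single output channel, and the convolution uses stride 1 with no padding. All function names (`is_all_zero`, `conv2d_valid`, `sparse_tiled_conv`) and the parameter `tile_c` are hypothetical and do not appear in the claim.

```python
from itertools import product


def is_all_zero(block):
    """Return True if a nested list of numbers contains only zeros."""
    if isinstance(block, list):
        return all(is_all_zero(b) for b in block)
    return block == 0


def conv2d_valid(ifm, kernel):
    """Dense convolution: ifm[C][H][W] with kernel[C][R][S] -> out[H-R+1][W-S+1]."""
    C, H, W = len(ifm), len(ifm[0]), len(ifm[0][0])
    R, S = len(kernel[0]), len(kernel[0][0])
    out = [[0] * (W - S + 1) for _ in range(H - R + 1)]
    for y, x in product(range(H - R + 1), range(W - S + 1)):
        out[y][x] = sum(
            ifm[c][y + r][x + s] * kernel[c][r][s]
            for c in range(C) for r in range(R) for s in range(S)
        )
    return out


def sparse_tiled_conv(ifm, kernel, tile_c):
    """Channel-tiled convolution that skips all-zero tile pairs.

    Assumption: tiles are groups of `tile_c` input channels.
    Returns (ofm, number_of_skipped_tile_pairs).
    """
    C, H, W = len(ifm), len(ifm[0]), len(ifm[0][0])
    R, S = len(kernel[0]), len(kernel[0][0])
    partial_ofms = []
    skipped = 0
    for c0 in range(0, C, tile_c):
        # Obtain one IFM tile and the matching kernel tile.
        ifm_tile = ifm[c0:c0 + tile_c]
        kernel_tile = kernel[c0:c0 + tile_c]
        # Sparsity check: a zero tile contributes nothing, so skip it.
        if is_all_zero(ifm_tile) or is_all_zero(kernel_tile):
            skipped += 1
            continue
        # Each surviving tile pair yields one partial OFM tile.
        partial_ofms.append(conv2d_valid(ifm_tile, kernel_tile))
    # Accumulate the partial OFM tiles into the final OFM tile.
    ofm = [[0] * (W - S + 1) for _ in range(H - R + 1)]
    for partial in partial_ofms:
        for y, row in enumerate(partial):
            for x, value in enumerate(row):
                ofm[y][x] += value
    return ofm, skipped


# Illustrative data: channels 2 and 3 of the IFM are entirely zero.
ifm = [
    [[1, 2, 0], [0, 1, 3], [2, 0, 1]],
    [[0, 1, 1], [1, 0, 0], [0, 2, 1]],
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
]
kernel = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[1, 1], [1, 1]],
    [[2, 0], [0, 2]],
]

ofm, skipped = sparse_tiled_conv(ifm, kernel, tile_c=2)
```

With `tile_c=2`, the second IFM tile is all zeros, so its tile pair is skipped, and the accumulated result still matches the dense convolution over all four channels. An actual accelerator could exploit sparsity at a finer granularity (for example, per element); the claim does not fix that choice.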