CPC G06F 15/7825 (2013.01) [G06N 3/063 (2013.01)] | 20 Claims |
1. A neural network (NN) accelerator with a multi-layer networks-on-chip (NoCs) architecture, comprising:
a plurality of cores and a central processing unit (CPU), wherein each core comprises a plurality of processing entity (PE) clusters, and the plurality of cores and the CPU are respectively coupled to memories,
a data exchange interface for connecting a host device to the NN accelerator,
an outer-layer NoC,
a middle-layer NoC, and
an inner-layer NoC,
wherein:
the outer-layer NoC is configured to transfer data between the host device and the memories, and comprises a bi-directional ring-shape data link connected to the data exchange interface and the memories,
the middle-layer NoC is configured to transfer data among the plurality of cores, and comprises a pair of uni-directional ring-shape data links, each uni-directional ring-shape data link comprising a subset of the plurality of cores; and
the inner-layer NoC is within each core and configured to perform data casting among the plurality of PE clusters within the core for implementing matrix operations, and comprises a cross-bar network connecting a global buffer of the core to the plurality of PE clusters within the core.
|
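The three-layer topology recited in claim 1 can be illustrated with a minimal sketch. The code below is not taken from the patent. The core count, the PE-cluster count, the one-memory-per-core layout, the split of cores into two disjoint halves, and all names (`Core`, `ring_links`, `build_accelerator`) are assumptions chosen for illustration only. It builds link lists for a bi-directional outer ring, a pair of uni-directional middle rings, and a per-core cross-bar from the global buffer to the PE clusters.

```python
from dataclasses import dataclass


@dataclass
class Core:
    """Hypothetical model of one core with its PE clusters."""
    core_id: int
    num_pe_clusters: int

    def crossbar_links(self):
        # Inner-layer NoC: cross-bar from the core's global buffer
        # to every PE cluster within that core.
        return [("global_buffer", f"pe_cluster_{i}")
                for i in range(self.num_pe_clusters)]


def ring_links(nodes, bidirectional):
    """Return directed links forming a ring over `nodes`."""
    n = len(nodes)
    links = [(nodes[i], nodes[(i + 1) % n]) for i in range(n)]
    if bidirectional:
        # Add the reverse direction of every link.
        links += [(dst, src) for src, dst in links]
    return links


def build_accelerator(num_cores=8, pe_clusters_per_core=4):
    """Assemble the three NoC layers (sizes are illustrative assumptions)."""
    if num_cores < 4 or num_cores % 2:
        raise ValueError("sketch assumes an even core count of at least 4")
    cores = [Core(i, pe_clusters_per_core) for i in range(num_cores)]

    # Outer-layer NoC: bi-directional ring joining the data exchange
    # interface and the memories (assumed: one per core plus one for the CPU).
    memories = [f"mem_core_{c.core_id}" for c in cores] + ["mem_cpu"]
    outer = ring_links(["data_exchange_interface"] + memories,
                       bidirectional=True)

    # Middle-layer NoC: a pair of uni-directional rings, each over a
    # subset of the cores (assumed: two disjoint halves).
    half = num_cores // 2
    names = [f"core_{c.core_id}" for c in cores]
    middle = [ring_links(names[:half], bidirectional=False),
              ring_links(names[half:], bidirectional=False)]

    # Inner-layer NoC: one cross-bar per core.
    inner = {f"core_{c.core_id}": c.crossbar_links() for c in cores}
    return {"outer": outer, "middle": middle, "inner": inner}


noc = build_accelerator()
```

With the assumed defaults, `noc["outer"]` holds the directed links of the bi-directional ring, `noc["middle"]` holds two separate uni-directional rings, and `noc["inner"]` maps each core name to its cross-bar links. The claim only requires that each middle-layer ring comprise a subset of the cores, so the disjoint split used here is one possible reading, not the only one.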