| CPC G06F 15/7825 (2013.01) [G06N 3/063 (2013.01)] | 20 Claims |

|
1. A neural network (NN) accelerator with a multi-layer networks-on-chip (NoCs) architecture, comprising:
a plurality of cores and a central processing unit (CPU), wherein each core comprises a plurality of processing entity (PE) clusters, and the plurality of cores and the CPU are respectively coupled to memories,
a data exchange interface connecting a host device to the NN accelerator,
an outer-layer NoC,
a middle-layer NoC, and
an inner-layer NoC,
wherein:
the outer-layer NoC is configured to transfer data between the host device and the memories, and comprises a plurality of routers forming a bi-directional ring-shape data link, wherein the plurality of routers comprise a first router connected to the data exchange interface, a second router connected to the CPU and the corresponding memory, and multiple third routers respectively connected to the plurality of cores and the corresponding memories,
the middle-layer NoC is configured to transfer data among the plurality of cores, and
the inner-layer NoC is within each core and configured to perform data casting among the plurality of PE clusters within the core for implementing matrix operations.
|
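
The outer-layer ring recited in claim 1 can be illustrated with a minimal Python sketch. The claim names three kinds of routers (one for the data exchange interface, one for the CPU and its memory, one per core and its memory) joined in a bi-directional ring; it does not specify the order of routers on the ring or any routing policy. The names `Router`, `build_outer_ring`, and `ring_route`, the router ordering, and the shorter-direction routing rule are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    index: int      # position on the ring (ordering is an assumption)
    attached: str   # endpoint served by this router


def build_outer_ring(num_cores: int) -> list[Router]:
    """Build the router list: first router -> host interface,
    second router -> CPU and its memory, third routers -> cores and their memories."""
    if num_cores < 1:
        raise ValueError("at least one core is required")
    routers = [Router(0, "data_exchange_interface"), Router(1, "cpu+memory")]
    routers += [Router(2 + i, f"core{i}+memory") for i in range(num_cores)]
    return routers


def ring_route(src: int, dst: int, size: int) -> list[int]:
    """Return the router indices visited from src to dst.

    Assumed policy (not stated in the claim): travel in whichever
    direction of the bi-directional ring needs fewer hops.
    """
    if not (0 <= src < size and 0 <= dst < size):
        raise ValueError("router index out of range")
    clockwise = (dst - src) % size
    counter = (src - dst) % size
    step = 1 if clockwise <= counter else -1
    hops = min(clockwise, counter)
    return [(src + step * k) % size for k in range(hops + 1)]


if __name__ == "__main__":
    ring = build_outer_ring(num_cores=4)
    # Host interface (router 0) to the last core (router 5) on a 6-router ring.
    print(ring_route(0, 5, len(ring)))
```

Under these assumptions, a transfer from the host-facing router to the last core's router takes one hop in the reverse direction rather than five hops forward, which is the practical benefit of a bi-directional link over a uni-directional ring.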