US 11,841,816 B2
Network-on-chip data processing method and device
Yao Zhang, Pudong New Area (CN); Shaoli Liu, Pudong New Area (CN); Jun Liang, Pudong New Area (CN); and Yu Chen, Pudong New Area (CN)
Assigned to SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD., Pudong New Area (CN)
Filed by Shanghai Cambricon Information Technology Co., Ltd., Pudong New Area (CN)
Filed on Dec. 29, 2021, as Appl. No. 17/564,389.
Application 17/564,389 is a continuation of application No. 17/278,812, previously published as PCT/CN2019/111977, filed on Oct. 18, 2019.
Claims priority of application No. 201811215820.7 (CN), filed on Oct. 18, 2018; application No. 201811215978.4 (CN), filed on Oct. 18, 2018; application No. 201811216718.9 (CN), filed on Oct. 18, 2018; application No. 201811216857.1 (CN), filed on Oct. 18, 2018; application No. 201811390409.3 (CN), filed on Nov. 21, 2018; application No. 201811390428.6 (CN), filed on Nov. 21, 2018; application No. 201811392232.0 (CN), filed on Nov. 21, 2018; application No. 201811392262.1 (CN), filed on Nov. 21, 2018; application No. 201811392270.6 (CN), filed on Nov. 21, 2018; application No. 201811392279.7 (CN), filed on Nov. 21, 2018; and application No. 201811393352.2 (CN), filed on Nov. 21, 2018.
Prior Publication US 2022/0138138 A1, May 5, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 13/40 (2006.01); G06N 3/04 (2023.01)
CPC G06F 13/4068 (2013.01) [G06N 3/04 (2013.01)]
19 Claims
 
1. A network-on-chip (NoC) processing system, comprising a storage device and a plurality of computation device clusters, wherein the storage device and the plurality of computation device clusters are arranged on a same chip, each computation device cluster includes a plurality of computation devices, wherein at least one of the plurality of computation device clusters is connected to the storage device, and at least two computation device clusters are connected to each other,
wherein at least one computation device of the plurality of computation device clusters is configured to perform a machine learning computation, and the computation device includes an operation unit and a controller unit, wherein the operation unit includes a primary processing circuit and a plurality of secondary processing circuits, wherein
the controller unit is configured to obtain input data and a computation instruction;
the controller unit is further configured to parse the computation instruction to obtain a plurality of operation instructions, and send the plurality of operation instructions and the input data to the primary processing circuit;
the primary processing circuit is configured to perform preorder processing on the input data, and send the data and the operation instructions among the primary processing circuit and the plurality of secondary processing circuits;
the plurality of secondary processing circuits are configured to perform intermediate computations in parallel according to the data and the operation instructions sent by the primary processing circuit to obtain a plurality of intermediate results, and send the plurality of intermediate results to the primary processing circuit; and
the primary processing circuit is further configured to perform postorder processing on the plurality of intermediate results to obtain a computation result of the computation instruction.
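
By way of non-limiting illustration only, the data flow recited in claim 1 among the controller unit, the primary processing circuit, and the plurality of secondary processing circuits may be sketched in software as follows. The sketch is not the claimed hardware and is not taken from the specification. The "DOT" computation instruction, the "MUL" and "ADD" operation instructions, the slice-based preorder processing, the summation used as postorder processing, and every function name are hypothetical assumptions chosen for the example, and a thread pool merely stands in for secondary processing circuits operating in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def controller_parse(computation_instruction):
    """Controller unit: parse one computation instruction into operation instructions.

    Hypothetical decoding: a "DOT" instruction becomes a multiply step
    (for the secondary circuits) and an accumulate step (for the primary circuit).
    """
    if computation_instruction != "DOT":
        raise ValueError(f"unsupported instruction: {computation_instruction}")
    return ["MUL", "ADD"]


def primary_preorder(x, w, n_secondary):
    """Primary circuit, preorder processing (assumed here to be data splitting).

    Cuts both input vectors into at most n_secondary contiguous slices.
    """
    size = max(1, -(-len(x) // n_secondary))  # ceiling division, at least 1
    return [(x[i:i + size], w[i:i + size]) for i in range(0, len(x), size)]


def secondary_compute(chunk):
    """Secondary circuit: intermediate computation on one slice (the "MUL" step)."""
    xs, ws = chunk
    return sum(a * b for a, b in zip(xs, ws))


def primary_postorder(intermediate_results):
    """Primary circuit, postorder processing (assumed here to be the "ADD" step)."""
    return sum(intermediate_results)


def run(computation_instruction, x, w, n_secondary=4):
    """End-to-end flow: controller -> primary -> secondaries -> primary."""
    if len(x) != len(w):
        raise ValueError("input vectors must have equal length")
    controller_parse(computation_instruction)  # validates and decodes the instruction
    chunks = primary_preorder(x, w, n_secondary)
    # The thread pool models the secondary circuits working in parallel.
    with ThreadPoolExecutor(max_workers=n_secondary) as pool:
        intermediates = list(pool.map(secondary_compute, chunks))
    return primary_postorder(intermediates)
```

For example, `run("DOT", [1, 2, 3, 4, 5], [1, 1, 1, 1, 1], n_secondary=2)` splits the input into two slices, yields the intermediate results 6 and 9, and returns 15. Other decompositions of a computation instruction would be equally consistent with the claim language.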