US 11,748,604 B2
Integrated circuit chip device
Shaoli Liu, Beijing (CN); Xinkai Song, Beijing (CN); Bingrui Wang, Beijing (CN); Yao Zhang, Beijing (CN); and Shuai Hu, Beijing (CN)
Assigned to CAMBRICON TECHNOLOGIES CORPORATION LIMITED, Beijing (CN)
Filed by CAMBRICON TECHNOLOGIES CORPORATION LIMITED, Beijing (CN)
Filed on Dec. 27, 2020, as Appl. No. 17/134,486.
Application 17/134,486 is a continuation of application No. 16/903,304, filed on Jun. 16, 2020.
Application 16/903,304 is a continuation of application No. PCT/CN2018/123929, filed on Dec. 26, 2018.
Claims priority of application No. 201711455388.4 (CN), filed on Dec. 27, 2017; application No. 201711455397.3 (CN), filed on Dec. 27, 2017; application No. 201711466943.3 (CN), filed on Dec. 28, 2017; application No. 201711468629.9 (CN), filed on Dec. 28, 2017; application No. 201711469408.3 (CN), filed on Dec. 28, 2017; application No. 201711469614.4 (CN), filed on Dec. 28, 2017; and application No. 201711469615.9 (CN), filed on Dec. 28, 2017.
Prior Publication US 2021/0117766 A1, Apr. 22, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/063 (2023.01); G06N 3/04 (2023.01)
CPC G06N 3/063 (2013.01) [G06N 3/04 (2013.01)] 20 Claims
OG exemplary drawing
 
1. An integrated circuit chip device configured to perform neural network forward computations, wherein the neural network has n layers, and the integrated circuit chip device comprises:
a main processing circuit; and
a plurality of basic processing circuits,
wherein:
the main processing circuit comprises a data type conversion circuit configured to convert data between a floating point data type and a fixed point data type;
the plurality of basic processing circuits are arranged as an array, each basic processing circuit is connected to an adjacent basic processing circuit, the main processing circuit is connected to a first quantity of basic processing circuits in a first row, the first quantity of basic processing circuits in an mth row, and m basic processing circuits in a first column;
the main processing circuit is configured to:
receive a first operation instruction; and
parse the first operation instruction to obtain a first computation instruction included in an ith layer of the forward computations, together with corresponding input data and weight data of the first operation instruction, wherein:
i is an integer greater than or equal to 1 and less than or equal to n, and if i is greater than or equal to 2, the input data is output data of an (i−1)th layer;
the main processing circuit is configured to:
determine a first complexity of the first computation instruction according to the input data, the weight data, and the first computation instruction;
determine a first data type corresponding to the first computation instruction according to the first complexity; and
determine whether to start the data type conversion circuit according to the first complexity, wherein:
the first data type is a floating point data type or a fixed point data type;
the main processing circuit is further configured to:
classify the input data of the first data type and the weight data of the first data type into a broadcasting data block and a distribution data block according to a type of the first computation instruction;
partition the distribution data block to obtain a plurality of basic data blocks;
distribute the plurality of basic data blocks to the plurality of basic processing circuits connected to the main processing circuit; and
broadcast the broadcasting data block to the basic processing circuits connected to the main processing circuit;
at least one of the plurality of basic processing circuits is configured to:
perform computations on the broadcasting data block and the basic data blocks of the first data type in parallel to obtain computation results; and
transfer the computation results to the main processing circuit through the basic processing circuits connected to the main processing circuit; and
the main processing circuit is configured to:
process the computation results to obtain an instruction result of the first computation instruction so that computations of the first computation instruction of the ith layer are completed.
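The data flow recited in the claim — deciding a data type from a "first complexity," converting between floating point and fixed point, splitting a distribution data block into basic data blocks, broadcasting, computing partial results in parallel, and gathering them — can be sketched in software. The sketch below is illustrative only: the complexity measure (multiply count), the threshold, the number of basic circuits, and the fixed point format are assumptions of this example, not details taken from the patent.

```python
import numpy as np

# Illustrative parameters; none of these values come from the claim.
FRAC_BITS = 8
N_BASIC_CIRCUITS = 4
COMPLEXITY_THRESHOLD = 1_000_000

def to_fixed(x, frac_bits=FRAC_BITS):
    """Convert floating point data to fixed point (integer mantissas)."""
    return np.round(x * (1 << frac_bits)).astype(np.int32)

def from_fixed(x, frac_bits):
    """Convert fixed point mantissas back to floating point."""
    return x.astype(np.float64) / (1 << frac_bits)

def forward_layer(inputs, weights, threshold=COMPLEXITY_THRESHOLD):
    """One layer's forward pass: inputs (batch, d_in), weights (d_out, d_in)."""
    # "First complexity": here taken to be the layer's multiply count.
    complexity = inputs.shape[0] * weights.shape[0] * weights.shape[1]
    use_fixed = complexity >= threshold  # whether to start the conversion circuit

    # Classify: the inputs form the broadcasting data block (every circuit
    # needs all of them); the weights form the distribution data block.
    if use_fixed:
        bcast = to_fixed(inputs).astype(np.int64)
        dist = to_fixed(weights).astype(np.int64)
    else:
        bcast, dist = inputs, weights

    # Partition the distribution block into basic data blocks, one per circuit.
    basic_blocks = np.array_split(dist, N_BASIC_CIRCUITS, axis=0)

    # Each "basic processing circuit" multiplies the broadcast block by its
    # basic data block; in hardware these partial products run in parallel.
    partials = [bcast @ blk.T for blk in basic_blocks]

    # The main processing circuit gathers the partial results and assembles
    # the instruction result of the first computation instruction.
    result = np.concatenate(partials, axis=1)
    if use_fixed:
        # A product of two FRAC_BITS values carries 2 * FRAC_BITS fraction bits.
        result = from_fixed(result, 2 * FRAC_BITS)
    return result
```

Under this reading, the low-complexity path keeps floating point data unchanged, while the high-complexity path quantizes both blocks before the multiplies, trading a small quantization error for cheaper integer arithmetic in the basic circuits.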