US 11,748,604 B2
Integrated circuit chip device
Shaoli Liu, Beijing (CN); Xinkai Song, Beijing (CN); Bingrui Wang, Beijing (CN); Yao Zhang, Beijing (CN); and Shuai Hu, Beijing (CN)
Assigned to CAMBRICON TECHNOLOGIES CORPORATION LIMITED, Beijing (CN)
Filed by CAMBRICON TECHNOLOGIES CORPORATION LIMITED, Beijing (CN)
Filed on Dec. 27, 2020, as Appl. No. 17/134,486.
Application 17/134,486 is a continuation of application No. 16/903,304, filed on Jun. 16, 2020.
Application 16/903,304 is a continuation of application No. PCT/CN2018/123929, filed on Dec. 26, 2018.
Claims priority of application No. 201711455388.4 (CN), filed on Dec. 27, 2017; application No. 201711455397.3 (CN), filed on Dec. 27, 2017; application No. 201711466943.3 (CN), filed on Dec. 28, 2017; application No. 201711468629.9 (CN), filed on Dec. 28, 2017; application No. 201711469408.3 (CN), filed on Dec. 28, 2017; application No. 201711469614.4 (CN), filed on Dec. 28, 2017; and application No. 201711469615.9 (CN), filed on Dec. 28, 2017.
Prior Publication US 2021/0117766 A1, Apr. 22, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/063 (2023.01); G06N 3/04 (2023.01)
CPC G06N 3/063 (2013.01) [G06N 3/04 (2013.01)] 20 Claims
OG exemplary drawing
 
1. An integrated circuit chip device configured to perform neural network forward computations, wherein the neural network has n layers, and the integrated circuit chip device comprises:
a main processing circuit; and
a plurality of basic processing circuits,
wherein:
the main processing circuit comprises a data type conversion circuit configured to convert data between a floating point data type and a fixed point data type;
the plurality of basic processing circuits are arranged as an array, each basic processing circuit is connected to an adjacent basic processing circuit, the main processing circuit is connected to a first quantity of basic processing circuits in a first row, the first quantity of basic processing circuits in an mth row, and m basic processing circuits in a first column;
the main processing circuit is configured to:
receive a first operation instruction; and
parse the first operation instruction to obtain a first computation instruction included in an ith layer of the forward computations, together with corresponding input data and weight data of the first operation instruction, wherein:
i is an integer greater than or equal to 1 and less than or equal to n, and if i is greater than or equal to 2, the input data is output data of an (i−1)th layer;
the main processing circuit is configured to:
determine a first complexity of the first computation instruction according to the input data, the weight data, and the first computation instruction;
determine a first data type corresponding to the first computation instruction according to the first complexity; and
determine whether to start the data type conversion circuit according to the first complexity, wherein:
the first data type is a floating point data type or a fixed point data type;
the main processing circuit is further configured to:
classify the input data of the first data type and the weight data of the first data type into a broadcasting data block and a distribution data block according to a type of the first computation instruction;
partition the distribution data block to obtain a plurality of basic data blocks;
distribute the plurality of basic data blocks to the plurality of basic processing circuits connected to the main processing circuit; and
broadcast the broadcasting data block to the basic processing circuits connected to the main processing circuit;
at least one of the plurality of basic processing circuits is configured to:
perform computations on the broadcasting data block and the basic data blocks of the first data type in parallel to obtain computation results; and
transfer the computation results to the main processing circuit through the basic processing circuits connected to the main processing circuit; and
the main processing circuit is configured to:
process the computation results to obtain an instruction result of the first computation instruction so that computations of the first computation instruction of the ith layer are completed.
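The data flow recited in the claim — deciding a data type from a "first complexity," converting between floating point and fixed point, splitting a distribution data block into basic data blocks, broadcasting, computing partial results in parallel, and gathering them — can be sketched in software. The sketch below is illustrative only: the complexity measure (multiply count), the threshold, the number of basic circuits, and the fixed point format are assumptions of this example, not details taken from the patent.

```python
import numpy as np

# Illustrative parameters; none of these values come from the claim.
FRAC_BITS = 8
N_BASIC_CIRCUITS = 4
COMPLEXITY_THRESHOLD = 1_000_000

def to_fixed(x, frac_bits=FRAC_BITS):
    """Convert floating point data to fixed point (integer mantissas)."""
    return np.round(x * (1 << frac_bits)).astype(np.int32)

def from_fixed(x, frac_bits):
    """Convert fixed point mantissas back to floating point."""
    return x.astype(np.float64) / (1 << frac_bits)

def forward_layer(inputs, weights, threshold=COMPLEXITY_THRESHOLD):
    """One layer's forward pass: inputs (batch, d_in), weights (d_out, d_in)."""
    # "First complexity": here taken to be the layer's multiply count.
    complexity = inputs.shape[0] * weights.shape[0] * weights.shape[1]
    use_fixed = complexity >= threshold  # whether to start the conversion circuit

    # Classify: the inputs form the broadcasting data block (every circuit
    # needs all of them); the weights form the distribution data block.
    if use_fixed:
        bcast = to_fixed(inputs).astype(np.int64)
        dist = to_fixed(weights).astype(np.int64)
    else:
        bcast, dist = inputs, weights

    # Partition the distribution block into basic data blocks, one per circuit.
    basic_blocks = np.array_split(dist, N_BASIC_CIRCUITS, axis=0)

    # Each "basic processing circuit" multiplies the broadcast block by its
    # basic data block; in hardware these partial products run in parallel.
    partials = [bcast @ blk.T for blk in basic_blocks]

    # The main processing circuit gathers the partial results and assembles
    # the instruction result of the first computation instruction.
    result = np.concatenate(partials, axis=1)
    if use_fixed:
        # A product of two FRAC_BITS values carries 2 * FRAC_BITS fraction bits.
        result = from_fixed(result, 2 * FRAC_BITS)
    return result
```

Under this reading, the low-complexity path keeps floating point data unchanged, while the high-complexity path quantizes both blocks before the multiplies, trading a small quantization error for cheaper integer arithmetic in the basic circuits.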