US 11,748,601 B2
Integrated circuit chip device
Shaoli Liu, Beijing (CN); Xinkai Song, Beijing (CN); Bingrui Wang, Beijing (CN); Yao Zhang, Beijing (CN); and Shuai Hu, Beijing (CN)
Assigned to CAMBRICON TECHNOLOGIES CORPORATION LIMITED, Beijing (CN)
Filed by CAMBRICON TECHNOLOGIES CORPORATION LIMITED, Beijing (CN)
Filed on Dec. 27, 2020, as Appl. No. 17/134,444.
Application 17/134,444 is a continuation of application No. 16/903,304, filed on Jun. 16, 2020, granted, now Pat. No. 11,544,546.
Application 16/903,304 is a continuation of application No. PCT/CN2018/123929, filed on Dec. 26, 2018.
Claims priority of application No. 201711455388.4 (CN), filed on Dec. 27, 2017; application No. 201711455397.3 (CN), filed on Dec. 27, 2017; application No. 201711466943.3 (CN), filed on Dec. 28, 2017; application No. 201711468629.9 (CN), filed on Dec. 28, 2017; application No. 201711469408.3 (CN), filed on Dec. 28, 2017; application No. 201711469614.4 (CN), filed on Dec. 28, 2017; and application No. 201711469615.9 (CN), filed on Dec. 28, 2017.
Prior Publication US 2021/0150324 A1, May 20, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/063 (2023.01); G06N 3/04 (2023.01)
CPC G06N 3/063 (2013.01) [G06N 3/04 (2013.01)] 20 Claims
OG exemplary drawing
 
1. An integrated circuit chip device for training a neural network having n layers, n being an integer greater than or equal to 2, wherein the integrated circuit chip device comprises:
a main processing circuit; and
a plurality of basic processing circuits;
wherein:
the main processing circuit comprises a data type conversion circuit configured to convert data between a floating point data type and a fixed point data type;
the integrated circuit chip device is configured to:
receive a training instruction;
determine input data and weight group data of a first layer according to the training instruction; and
perform a forward computation of an ith layer of the neural network on the input data and the weight group data of the first layer to obtain an ith output result of the forward computation, i being an integer greater than or equal to 1 and smaller than or equal to n;
the main processing circuit is further configured to:
obtain an ith output result gradient according to the ith output result;
obtain an ith backward computation of backward computations of the ith layer according to the training instruction;
obtain an ith backward computation complexity according to the ith output result gradient, input data of the ith layer, weight group data of the ith layer, and the ith backward computation;
determine an ith back data type corresponding to the ith output result gradient, the input data of the ith layer, and the weight group data of the ith layer according to the ith backward computation complexity; and
classify the ith output result gradient, the input data of the ith layer, and the weight group data of the ith layer into a broadcasting data block and a distribution data block according to a type of the ith backward computation;
at least one of the plurality of basic processing circuits is configured to:
perform computations on the broadcasting data block of the ith back data type and received basic data blocks of the ith back data type to obtain computation results; and
transfer the computation results to the main processing circuit;
the main processing circuit is further configured to:
process the computation results to obtain a weight group gradient of the ith layer and an input data gradient of the ith layer; and
update the weight group data of the ith layer according to the weight group gradient of the ith layer, wherein the ith back data type includes a fixed point type or a floating point type;
the integrated circuit chip device is further configured to:
perform backward computations of an (i−1)th layer using the input data gradient of the ith layer as an (i−1)th output result gradient of the (i−1)th layer to obtain a weight group gradient of the (i−1)th layer; and
update weight group data of a corresponding layer according to the weight group gradient of the (i−1)th layer, wherein the weight group data includes at least two weights; and
the main processing circuit is further configured to:
when the ith backward computation is a multiplication computation, classify both the input data of the ith layer and the weight group data of the ith layer into distribution data blocks, and classify the ith output result gradient into a broadcasting data block; and
when the ith backward computation is a convolution computation, classify both the input data of the ith layer and the weight group data of the ith layer into broadcasting data blocks, and the ith output result gradient into a distribution data block.
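The claimed backward pass combines three mechanisms: selecting a fixed point or floating point "back data type" from the backward computation's complexity, classifying operands into broadcasting versus distribution data blocks by operation type, and updating weights from the computed gradient. The sketch below illustrates those mechanisms only; it is not the patented implementation. The quantization scheme (frac_bits, rounding), the complexity threshold, and the SGD-style update are all illustrative assumptions, as the claim does not specify them.

```python
import numpy as np

# Hypothetical fixed point scheme: the claim only recites conversion between
# floating point and fixed point; frac_bits and rounding are assumptions.
def to_fixed(x, frac_bits=8):
    return np.round(np.asarray(x) * (1 << frac_bits)).astype(np.int32)

def to_float(q, frac_bits=8):
    return q.astype(np.float64) / (1 << frac_bits)

def choose_back_data_type(complexity, threshold=1_000_000):
    # The claim derives an "ith back data type" from the ith backward
    # computation complexity; this threshold rule is an illustrative stand-in
    # (heavier computations favor cheaper fixed point arithmetic).
    return "fixed" if complexity > threshold else "float"

def classify_blocks(op_type, layer_input, weights, out_grad):
    # Mirrors the claim's final two limitations:
    #   multiplication -> input & weights are distribution blocks,
    #                     output result gradient is the broadcasting block;
    #   convolution    -> input & weights are broadcasting blocks,
    #                     output result gradient is the distribution block.
    if op_type == "multiplication":
        return {"broadcasting": [out_grad],
                "distribution": [layer_input, weights]}
    if op_type == "convolution":
        return {"broadcasting": [layer_input, weights],
                "distribution": [out_grad]}
    raise ValueError(f"unsupported backward computation: {op_type}")

def update_weights(weights, weight_grad, lr=0.01):
    # "update the weight group data of the ith layer according to the weight
    # group gradient of the ith layer" -- a plain SGD step as a stand-in.
    return weights - lr * weight_grad
```

In this reading, the main processing circuit would run `choose_back_data_type` and `classify_blocks`, convert the resulting blocks with `to_fixed`/`to_float`, hand distribution blocks to the basic processing circuits while broadcasting the broadcasting block, and apply `update_weights` to the gradient assembled from their results.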