US 11,720,783 B2
Multiplication and addition device for matrices, neural network computing device, and method
Tianshi Chen, Pudong New Area (CN); Yimin Zhuang, Pudong New Area (CN); Qi Guo, Pudong New Area (CN); Shaoli Liu, Pudong New Area (CN); and Yunji Chen, Pudong New Area (CN)
Assigned to SHANGHAI CAMBRICON INFORMATION TECHNOLOGY CO., LTD., Pudong New Area (CN)
Filed by Shanghai Cambricon Information Technology Co., Ltd., Pudong New Area (CN)
Filed on Oct. 21, 2019, as Appl. No. 16/658,800.
Application 16/658,800 is a continuation of application No. 16/440,257, filed on Jun. 13, 2019, granted, now 10,509,998.
Application 16/440,257 is a continuation in part of application No. PCT/CN2017/116456, filed on Dec. 15, 2017.
Claims priority of application No. 201611185917.9 (CN), filed on Dec. 20, 2016.
Prior Publication US 2020/0050927 A1, Feb. 13, 2020
Int. Cl. G06N 3/063 (2023.01); G06F 7/544 (2006.01); G06F 17/16 (2006.01); G06N 3/04 (2023.01); G06N 3/06 (2006.01)
CPC G06N 3/063 (2013.01) [G06F 7/5443 (2013.01); G06F 17/16 (2013.01); G06N 3/04 (2013.01); G06N 3/06 (2013.01)]
14 Claims
 
1. A neural network operation device, comprising:
a submatrix divider circuit configured to select a portion of an input data matrix as an input submatrix;
a matrix element memory configured to:
receive a convolution kernel matrix that includes one or more kernel values,
wherein each of the one or more kernel values is represented as a sequence that includes one or more bits,
wherein the one or more bits that represent each of the kernel values include a sign bit,
wherein the sign bits of the kernel values are stored in a sign storage space, and
respectively store the one or more bits in one or more storage spaces in accordance with positions of the one or more bits in the sequence;
a calculator circuit configured to calculate an intermediate result for each storage space based on one or more input elements in the input submatrix, wherein the one or more input elements correspond to non-zero values stored in the storage space;
an accumulator circuit configured to sum the intermediate results to generate an output value;
a convolution result assembler circuit configured to assemble the output values calculated for different portions of the input data matrix to generate an output matrix; and
a symbol calculator circuit configured to perform an exclusive disjunction operation between signs of the one or more input elements and the sign bits stored in the sign storage space to generate a binary result sequence.
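
The data flow recited in claim 1 can be illustrated with a short software sketch. This is not the patented circuit, only one possible reading of the claim language under stated assumptions: kernel values are taken to be signed integers in sign-magnitude form; each "storage space" is modelled as a list of 0/1 bits for one bit position; the intermediate result for a storage space is assumed to be the signed sum of the input elements whose bit in that space is non-zero; and the accumulator is assumed to weight each intermediate result by 2^i for bit position i before summing (the claim itself does not recite the weighting). All function and parameter names (bit_planes, output_value, convolve, magnitude_bits) are hypothetical.

```python
def bit_planes(kernel, magnitude_bits):
    """Matrix element memory (sketch): one sign plane plus one 0/1 plane per bit position."""
    flat = [v for row in kernel for v in row]
    signs = [1 if v < 0 else 0 for v in flat]            # sign storage space
    planes = [[(abs(v) >> i) & 1 for v in flat]          # storage space for bit i
              for i in range(magnitude_bits)]
    return signs, planes


def output_value(submatrix, signs, planes):
    """Compute one output value from an input submatrix without any multiplication."""
    flat = [v for row in submatrix for v in row]
    input_signs = [1 if v < 0 else 0 for v in flat]
    # Symbol calculator: XOR of input signs and kernel sign bits gives the
    # binary result sequence (1 means the product term is negative).
    result_signs = [a ^ b for a, b in zip(input_signs, signs)]

    total = 0
    for i, plane in enumerate(planes):
        # Calculator: add up only the inputs whose bit in this plane is non-zero.
        intermediate = sum(
            -abs(x) if s else abs(x)
            for x, s, bit in zip(flat, result_signs, plane)
            if bit
        )
        # Accumulator: assumed to weight bit position i by 2**i before summing.
        total += intermediate * (1 << i)
    return total


def convolve(data, kernel, magnitude_bits=7):
    """Submatrix divider plus result assembler: slide the kernel window over the data."""
    kh, kw = len(kernel), len(kernel[0])
    signs, planes = bit_planes(kernel, magnitude_bits)
    out = []
    for r in range(len(data) - kh + 1):
        out_row = []
        for c in range(len(data[0]) - kw + 1):
            sub = [data_row[c:c + kw] for data_row in data[r:r + kh]]
            out_row.append(output_value(sub, signs, planes))
        out.append(out_row)
    return out


data = [[1, -2, 3], [4, 5, -6], [-7, 8, 9]]
kernel = [[2, -3], [0, 5]]
result = convolve(data, kernel)
print(result)  # [[33, -43], [33, 73]]
```

In this sketch the per-element multiplications of an ordinary convolution are replaced by selective additions followed by one shift per bit position, and the output matches a conventional multiply-accumulate convolution for integer kernels whose magnitudes fit in magnitude_bits bits.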