US 11,915,126 B2
Low power hardware architecture for a convolutional neural network
Jian hui Huang, Los Altos, CA (US); James Michael Bodwin, Cupertino, CA (US); Pradeep R. Joginipally, San Jose, CA (US); Shabarivas Abhiram, Mountain View, CA (US); Gary S. Goldman, Los Altos, CA (US); Martin Stefan Patz, Bavaria (DE); Eugene M. Feinberg, San Jose, CA (US); and Berend Ozceri, Los Gatos, CA (US)
Assigned to Recogni Inc., San Jose, CA (US)
Filed by Recogni Inc., San Jose, CA (US)
Filed on Sep. 4, 2020, as Appl. No. 16/948,164.
Prior Publication US 2022/0076104 A1, Mar. 10, 2022
Int. Cl. G06N 3/04 (2023.01); G06N 3/06 (2006.01); G06N 3/063 (2023.01); G06F 7/50 (2006.01); G06F 7/544 (2006.01); G06N 3/0464 (2023.01)

CPC G06N 3/063 (2013.01) [G06F 7/50 (2013.01); G06F 7/5443 (2013.01); G06N 3/0464 (2023.01)]

5 Claims

4. A system, comprising:

a first computing unit configured to:

receive a first plurality of quantized activation values represented by a first plurality of activation mantissa values and a first activation exponent shared by the first plurality of activation mantissa values, wherein the first plurality of quantized activation values is a quantized representation of a first matrix with values

receive a first quantized convolutional kernel represented by a first plurality of kernel mantissa values and a first kernel exponent shared by the first plurality of kernel mantissa values;

compute a first dot product of the first plurality of activation mantissa values and the first plurality of kernel mantissa values; and

compute a first sum of the first shared activation exponent and the first shared kernel exponent; and

a second computing unit configured to:

receive a second plurality of quantized activation values represented by a second plurality of activation mantissa values and a second activation exponent shared by the second plurality of activation mantissa values, wherein the second plurality of quantized activation values is a quantized representation of a second matrix with values

wherein six of the values of the first matrix are identical to six of the values of the second matrix;

receive the first quantized convolutional kernel;

compute a second dot product of the second plurality of activation mantissa values and the first plurality of kernel mantissa values; and

compute a second sum of the second shared activation exponent and the first shared kernel exponent.
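
For illustration only, the arithmetic recited in claim 4 can be sketched in software. The sketch below is not the claimed hardware. The function names (`quantize_shared_exponent`, `computing_unit`, `window`), the 8-bit mantissa width, the power-of-two scaling rule, and the 3x4 activation patch with 3x3 windows are all assumptions chosen for the example; the claim itself only requires mantissa values with a shared exponent, a dot product of mantissas, a sum of exponents, and two matrices with six identical values.

```python
import math

def quantize_shared_exponent(values, mantissa_bits=8):
    """Quantize values to integer mantissas that share one power-of-two exponent.

    Assumed scheme: value ~= mantissa * 2**exponent, with signed mantissas
    limited to +/-(2**(mantissa_bits - 1) - 1).
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0] * len(values), 0
    limit = 2 ** (mantissa_bits - 1) - 1
    # Smallest exponent for which the largest magnitude still fits the mantissa.
    exponent = math.ceil(math.log2(max_abs / limit))
    scale = 2.0 ** exponent
    # Clamp to guard against floating-point edge cases in log2.
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return mantissas, exponent

def computing_unit(act_mantissas, act_exponent, kernel_mantissas, kernel_exponent):
    """Return (dot product of mantissas, sum of the two shared exponents)."""
    dot = sum(a * k for a, k in zip(act_mantissas, kernel_mantissas))
    return dot, act_exponent + kernel_exponent

def window(rows, col, width=3):
    """Flatten the window of the given width that starts at column `col`."""
    return [v for row in rows for v in row[col:col + width]]

# Hypothetical 3x4 activation patch: adjacent 3x3 windows overlap in two columns.
patch = [[1.0, 2.0, 3.0, 4.0],
         [5.0, 6.0, 7.0, 8.0],
         [9.0, 10.0, 11.0, 12.0]]
first_matrix = window(patch, 0)
second_matrix = window(patch, 1)
shared_values = set(first_matrix) & set(second_matrix)  # six values in common

kernel = [0.5, 0.25, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
k_mant, k_exp = quantize_shared_exponent(kernel)

# Each unit quantizes its own window but reuses the same quantized kernel.
a1_mant, a1_exp = quantize_shared_exponent(first_matrix)
a2_mant, a2_exp = quantize_shared_exponent(second_matrix)
dot1, exp1 = computing_unit(a1_mant, a1_exp, k_mant, k_exp)
dot2, exp2 = computing_unit(a2_mant, a2_exp, k_mant, k_exp)

# The real-valued result is recovered as dot * 2**exponent.
result1 = dot1 * 2.0 ** exp1
result2 = dot2 * 2.0 ** exp2
```

Under this assumed scheme, each unit needs only integer multiply-accumulate operations for the mantissas and a single integer addition for the exponents; the scaling by a power of two is deferred until the result is consumed.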