US 12,223,288 B2
Neural network processing unit including approximate multiplier and system on chip including the same
Jun-seok Park, Hwaseong-si (KR)
Assigned to Samsung Electronics Co., Ltd., Gyeonggi-do (KR)
Filed by Samsung Electronics Co., Ltd., Suwon-si (KR)
Filed on Jan. 3, 2019, as Appl. No. 16/239,046.
Claims priority of application No. 10-2018-0002776 (KR), filed on Jan. 9, 2018.
Prior Publication US 2019/0212981 A1, Jul. 11, 2019
Int. Cl. G06F 7/487 (2006.01); G06F 7/499 (2006.01); G06F 7/544 (2006.01); G06N 3/04 (2023.01); G06N 3/063 (2023.01)

CPC G06F 7/4876 (2013.01) [G06F 7/49947 (2013.01); G06F 7/5443 (2013.01); G06N 3/04 (2013.01); G06N 3/063 (2013.01)]

17 Claims

1. A neural network processing unit configured to perform a computation based on one or more instances of input data and a plurality of weights, the neural network processing unit comprising:

processing circuitry configured to output at least a first control signal and a second control signal to at least one neural processor (NPU), the first and second control signals respectively configured to enable a selection by the at least one NPU between a training mode and an inference mode; and

a plurality of neural processors (NPUs) configured to implement a neural network, and including the at least one NPU,

wherein the at least one NPU of the plurality of NPUs is configured to

switch to a fixed-point approximate multiplication training mode in response to receiving the first control signal,

receive a first value and a second value while in the fixed-point approximate multiplication training mode,

perform a fixed-point approximate multiplication operation based on the first value and the second value in response to receiving the first value and the second value while in the fixed-point approximate multiplication training mode,

review an output value for a loss of accuracy, the output value including a result of the fixed-point approximate multiplication operation and the review including

performing a stochastic rounding operation on the output value, and

determining, based on a result of the stochastic rounding operation, the loss of accuracy for the output value, and

train the neural network by tuning a parameter of the at least one NPU based on the determined loss, and

wherein the at least one NPU of the plurality of NPUs is configured to

select a general multiplication inference mode in response to receiving the second control signal,

receive an input value while in the general multiplication inference mode, and

perform a general multiplication operation based on the input value and the tuned parameter in response to receiving the input value while in the general multiplication inference mode.
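
The two modes recited in claim 1 can be illustrated with a minimal software sketch. This is an illustration under stated assumptions, not the claimed circuit. The Q-format width `FRAC_BITS`, the constant `DROP_BITS`, the mode strings `"train"` and `"infer"`, and all function names are hypothetical. The approximate multiplier is modelled by a simple stand-in that ignores the low bits of each operand, and the claim does not specify this technique. Stochastic rounding is shown in its usual form: the discarded fraction sets the probability of rounding up, so the rounded value is unbiased on average.

```python
import random

FRAC_BITS = 8   # assumed Q-format fractional width (hypothetical, not from the claim)
DROP_BITS = 4   # assumed count of operand low bits ignored by the stand-in approximation


def stochastic_round_shift(value, shift, rng):
    """Drop `shift` low bits, rounding up with probability equal to the dropped fraction."""
    if shift <= 0:
        return value
    floor = value >> shift                  # arithmetic shift floors toward -infinity
    remainder = value - (floor << shift)    # always in the range 0 .. 2**shift - 1
    return floor + (1 if rng.randrange(1 << shift) < remainder else 0)


def approx_multiply(a, b, rng):
    """Stand-in approximate fixed-point multiply followed by stochastic rounding."""
    mask = ~((1 << DROP_BITS) - 1)          # clears the low DROP_BITS bits of an operand
    product = (a & mask) * (b & mask)       # product carries 2 * FRAC_BITS fractional bits
    return stochastic_round_shift(product, FRAC_BITS, rng)


def general_multiply(a, b):
    """Exact fixed-point multiply with round-half-up back to FRAC_BITS fractional bits."""
    return (a * b + (1 << (FRAC_BITS - 1))) >> FRAC_BITS


def npu_multiply(a, b, mode, rng=None):
    """Select the multiplier by mode, loosely mirroring the first and second control signals."""
    if mode == "train":
        return approx_multiply(a, b, rng or random.Random())
    if mode == "infer":
        return general_multiply(a, b)
    raise ValueError("mode must be 'train' or 'infer'")
```

In this sketch, operands are integers scaled by `2**FRAC_BITS`, so 1.5 is passed as 384 and 2.0 as 512. A training loop would call `npu_multiply(..., "train", rng)` and compare the result with a reference value to estimate the accuracy loss. After parameter tuning, the same parameters would be used with `"infer"`.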