US 12,217,158 B2
Neural network circuitry having floating point format with asymmetric range
Xiao Sun, Pleasantville, NY (US); Jungwook Choi, Seoul (KR); Naigang Wang, Ossining, NY (US); Chia-Yu Chen, White Plains, NY (US); and Kailash Gopalakrishnan, New York, NY (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Sep. 3, 2019, as Appl. No. 16/558,554.
Prior Publication US 2021/0064976 A1, Mar. 4, 2021
Int. Cl. G06N 3/063 (2023.01); G06N 3/084 (2023.01)
CPC G06N 3/063 (2013.01) [G06N 3/084 (2013.01)] 25 Claims
OG exemplary drawing
 
1. An apparatus, comprising:
control circuitry configured to perform neural network operations on floating point numbers during a neural network training process which comprises forward propagation neural network operations and backward propagation neural network operations, wherein the control circuitry comprises floating point unit circuitry, wherein the control circuitry is configured to:
selectively configure the floating point unit circuitry during the neural network training process to operate on floating point numbers with different n-bit floating point formats for the forward propagation neural network operations and the backward propagation neural network operations, the different n-bit floating point formats having a same number n of bits, but different formats with respect to exponent bits and mantissa bits, wherein in selectively configuring the floating point unit circuitry, the control circuitry is configured to:
selectively configure the floating point unit circuitry during the neural network training process to operate using a first n-bit floating point format to perform the forward propagation neural network operations on floating point numbers having the first n-bit floating point format, the first n-bit floating point format having a configuration consisting of a sign bit, m exponent bits and p mantissa bits where m is greater than p;
selectively configure the floating point unit circuitry during the neural network training process to operate using a second n-bit floating point format to perform the backward propagation neural network operations on floating point numbers having the second n-bit floating point format that is different than the first n-bit floating point format, the second n-bit floating point format having a configuration consisting of a sign bit, q exponent bits and r mantissa bits where q is greater than m and r is less than p; and
selectively convert computation results, which are generated and output from the floating point unit circuitry as a result of performing the forward and backward propagation neural network operations, into one of the first n-bit floating point format and the second n-bit floating point format for further processing by the floating point unit circuitry in a next iteration of the neural network training process.
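
The claim above constrains only the relationship between the two n-bit formats (the forward format has more mantissa bits than exponent bits, m greater than p, while the backward format trades mantissa bits for extra exponent range, q greater than m and r less than p); it does not fix concrete bit widths. Below is a minimal software sketch, not the claimed circuitry, assuming an illustrative n = 8 split of 1 sign / 4 exponent / 3 mantissa bits for forward propagation and 1 sign / 5 exponent / 2 mantissa bits for backward propagation, which satisfies those constraints. The helper quantize_fp is a hypothetical name introduced here for illustration.

    import math

    def quantize_fp(x: float, exp_bits: int, man_bits: int) -> float:
        # Round x to the nearest value representable in a (1 sign, exp_bits
        # exponent, man_bits mantissa) floating point format, saturating at
        # the format's largest finite magnitude.
        if x == 0.0:
            return 0.0
        bias = 2 ** (exp_bits - 1) - 1
        e_min = 1 - bias                        # smallest normal exponent
        e_max = (2 ** exp_bits - 2) - bias      # top exponent code reserved
        e = max(math.floor(math.log2(abs(x))), e_min)
        step = 2.0 ** (e - man_bits)            # spacing of representable values near x
        q = round(x / step) * step
        max_val = (2.0 - 2.0 ** (-man_bits)) * 2.0 ** e_max
        return max(-max_val, min(max_val, q))

    # Forward-pass activations and weights: more mantissa bits for precision.
    fwd = lambda x: quantize_fp(x, exp_bits=4, man_bits=3)   # illustrative 1-4-3 format
    # Backward-pass gradients: more exponent bits for dynamic range.
    bwd = lambda g: quantize_fp(g, exp_bits=5, man_bits=2)   # illustrative 1-5-2 format

    print(fwd(0.1376), bwd(3.2e-5))

In this sketch the asymmetry mirrors the claim's rationale: forward-pass tensors tend to need finer precision (more mantissa bits), while gradients in backward propagation span a wider dynamic range and benefit from additional exponent bits at the cost of mantissa bits.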