US 11,972,347 B2
System and method for emulating quantization noise for a neural network
Chaim Baskin, Kiryat Motzkin (IL); Eliyahu Schwartz, Tel-Aviv (IL); Evgenii Zheltonozhskii, Kiryat Motzkin (IL); Alexander Bronstein, Haifa (IL); Natan Liss, Haifa (IL); and Abraham Mendelson, Haifa (IL)
Assigned to Technion Research & Development Foundation Limited, Haifa (IL); and Ramot at Tel-Aviv University Ltd., Tel-Aviv (IL)
Appl. No. 17/049,651
Filed by Technion Research & Development Foundation Limited, Haifa (IL); and Ramot at Tel-Aviv University Ltd., Tel-Aviv (IL)
PCT Filed Apr. 22, 2019, PCT No. PCT/IL2019/050457
§ 371(c)(1), (2) Date Oct. 22, 2020,
PCT Pub. No. WO2019/207581, PCT Pub. Date Oct. 31, 2019.
Claims priority of provisional application 62/661,016, filed on Apr. 22, 2018.
Prior Publication US 2021/0241096 A1, Aug. 5, 2021
Int. Cl. G06N 3/08 (2023.01); G06F 18/211 (2023.01); G06F 18/23 (2023.01); G06N 3/048 (2023.01)
CPC G06N 3/08 (2013.01) [G06F 18/211 (2023.01); G06F 18/23 (2023.01); G06N 3/048 (2023.01)] 16 Claims
OG exemplary drawing
 
1. A system for training a classification system's quantized neural network dataset, comprising at least one hardware processor adapted to:
receive digital input data comprising a plurality of training input value sets and a plurality of target value sets;
in each training iteration of a plurality of training iterations:
for each layer, comprising a plurality of weight values, of one or more layers of a plurality of layers of a neural network:
compute a set of transformed values by applying to a plurality of layer values, comprising a plurality of previous layer output values of a previous layer and the layer's plurality of weight values, one or more emulated non-uniformly quantized transformations by adding to each value of the plurality of layer values one or more uniformly distributed random noise values; and
compute a plurality of layer output values by applying to the set of transformed values one or more arithmetic operations;
compute a plurality of training output values from a combination of the plurality of layer output values of a last layer of the plurality of layers; and
update one or more of the plurality of weight values of the one or more layers to decrease a value of a loss function computed using the plurality of target value sets and the plurality of training output values; and
output the updated plurality of weight values of the plurality of layers;
wherein the at least one hardware processor applies the one or more emulated non-uniformly quantized transformations to the plurality of layer values to compute a set of transformed values by:
applying to each previous layer output value of the plurality of previous layer output values a first emulated non-uniformly quantized transformation by adding a first uniformly distributed random noise value, having a first distribution having a first variance, to produce a set of transformed output values;
applying to each weight value of the layer's plurality of weight values a second emulated non-uniformly quantized transformation by adding a second uniformly distributed random noise value, having a second distribution having a second variance, to produce a set of transformed weight values; and
combining the set of transformed output values with the set of transformed weight values to produce the set of transformed values.
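The claim describes training with emulated quantization noise: during each training iteration, uniformly distributed random noise (with its own variance for activations and for weights) is added to the previous layer's outputs and to the layer's weights before they are combined, and the weights are then updated against a loss on the training outputs. The sketch below is a minimal, hedged illustration of that idea in PyTorch, not the patented implementation; the class name NoisyQuantLinear, the parameters act_noise_scale and weight_noise_scale, and the synthetic training loop are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyQuantLinear(nn.Module):
    """Linear layer that adds uniform random noise to its inputs (the previous
    layer's outputs) and to its own weights before combining them, emulating
    the effect of quantization during training (illustrative sketch only)."""

    def __init__(self, in_features, out_features,
                 act_noise_scale=1.0 / 255, weight_noise_scale=1.0 / 255):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Half-widths of the uniform noise; each distribution has its own variance.
        self.act_noise_scale = act_noise_scale
        self.weight_noise_scale = weight_noise_scale

    def forward(self, x):
        if self.training:
            # First transformation: uniform noise added to the previous layer's outputs.
            x = x + (torch.rand_like(x) - 0.5) * 2 * self.act_noise_scale
            # Second transformation: uniform noise added to this layer's weights.
            w = self.weight + (torch.rand_like(self.weight) - 0.5) * 2 * self.weight_noise_scale
        else:
            w = self.weight
        # Combine the transformed outputs with the transformed weights.
        return F.linear(x, w, self.bias)


# Illustrative training loop on synthetic data: update the weights to decrease
# a loss computed from target values and training output values.
model = nn.Sequential(NoisyQuantLinear(16, 32), nn.ReLU(), NoisyQuantLinear(32, 4))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs, targets = torch.randn(64, 16), torch.randint(0, 4, (64,))
for _ in range(10):
    optimizer.zero_grad()
    loss = F.cross_entropy(model(inputs), targets)
    loss.backward()
    optimizer.step()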