US 12,437,180 B2
System and method for modifying integer and fractional portion sizes of a parameter of a neural network
Pierre Demaj, Nice (FR); and Laurent Folliot, Gourdon (FR)
Assigned to STMicroelectronics (Rousset) SAS, Rousset (FR)
Filed by STMicroelectronics (Rousset) SAS, Rousset (FR)
Filed on Mar. 5, 2020, as Appl. No. 16/810,582.
Claims priority of application No. 1902853 (FR), filed on Mar. 20, 2019.
Prior Publication US 2020/0302266 A1, Sep. 24, 2020
Int. Cl. G06N 3/04 (2023.01); G06N 3/08 (2023.01)
CPC G06N 3/04 (2013.01) [G06N 3/08 (2013.01)]
21 Claims
 
1. A method comprising:
analyzing a set of initial parameters defining an initial multilayer neural network, the analyzing comprising reducing a size of at least one initial parameter of each layer of the initial multilayer neural network to obtain for each layer a set of new parameters defining a new neural network, each new parameter of the set of new parameters having its data represented in fixed point format having a fixed number of decimal places, with an integer portion corresponding to an integer number to a left of a decimal point, a fractional portion corresponding to a fractional number to a right of the decimal point, and a sign bit;
implementing the new neural network using a test input data set applied only once to each layer to generate an output data matrix for each layer;
determining a distribution function or a density function for each layer based on the output data matrix for each layer;
using the determined distribution function or density function to select, according to a compromise between a risk of saturation and a risk of loss of precision, either:
an increase in a size of a first memory area allocated to the fractional portion and a reduction in a size of a second memory area allocated to the integer portion of each new parameter associated with each layer, or
a reduction in the size of the first memory area allocated to the fractional portion and an increase in the size of the second memory area allocated to the integer portion of each new parameter associated with each layer; and
increasing the size of the first memory area allocated to the fractional portion and reducing the size of the second memory area allocated to the integer portion of each new parameter associated with each layer.
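The fixed point representation and the saturation/precision trade-off recited in claim 1 can be illustrated with a minimal sketch. It is not the patented method: the claim does not say how the compromise is computed, so the quantile rule below, the 8-bit word width, and the names `quantize`, `choose_split` and `max_saturation` are all assumptions made for illustration. The sketch takes the output values of one layer, builds their empirical distribution, and picks the smallest integer portion that keeps the chosen fraction of values from saturating, leaving the remaining bits to the fractional portion.

```python
import math

def quantize(value, int_bits, frac_bits):
    """Round a value to signed fixed point (1 sign bit + int_bits + frac_bits).

    Values outside the representable range are clipped (saturation).
    Note: Python's round() rounds exact ties to the nearest even code.
    """
    scale = 1 << frac_bits
    max_code = (1 << (int_bits + frac_bits)) - 1
    min_code = -(1 << (int_bits + frac_bits))
    code = round(value * scale)
    code = max(min_code, min(max_code, code))
    return code / scale

def choose_split(outputs, total_bits=8, max_saturation=0.01):
    """Choose (int_bits, frac_bits) from the empirical distribution of outputs.

    Assumed rule (hypothetical, not taken from the patent): find the magnitude
    that at most `max_saturation` of the outputs exceed, give the integer
    portion just enough bits to cover it, and give the rest to the fraction.
    """
    magnitudes = sorted(abs(v) for v in outputs)
    n = len(magnitudes)
    # Index of the empirical (1 - max_saturation) quantile.
    idx = max(0, min(n - 1, math.ceil((1 - max_saturation) * n) - 1))
    threshold = magnitudes[idx]

    value_bits = total_bits - 1  # one bit is reserved for the sign
    int_bits = 0
    # Grow the integer portion until 2**int_bits exceeds the threshold.
    while int_bits < value_bits and (1 << int_bits) <= threshold:
        int_bits += 1
    return int_bits, value_bits - int_bits

# Outputs concentrated below 1.0: all value bits go to the fractional portion.
small_layer = [0.1, -0.5, 0.9, 0.3]
# Outputs reaching about 5: three integer bits are needed to limit saturation.
wide_layer = [5.3, -1.0, 0.2, 3.0]

small_split = choose_split(small_layer)
wide_split = choose_split(wide_layer)
```

Under these assumptions, a layer whose outputs stay below 1.0 gets a larger fractional portion (lower risk of precision loss), while a layer with larger outputs gets a larger integer portion (lower risk of saturation). Raising `max_saturation` shifts the compromise toward precision.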