US 11,922,316 B2
Training a neural network using periodic sampling over model weights
Samarth Tripathi, Mountain View, CA (US); Jiayi Liu, Fremont, CA (US); Unmesh Kurup, Sunnyvale, CA (US); and Mohak Shah, Dublin, CA (US)
Assigned to LG ELECTRONICS INC., Seoul (KR)
Filed by LG ELECTRONICS INC., Seoul (KR)
Filed on Aug. 13, 2020, as Appl. No. 16/993,147.
Claims priority of provisional application 62/915,032, filed on Oct. 15, 2019.
Prior Publication US 2021/0110274 A1, Apr. 15, 2021
Int. Cl. G06N 3/084 (2023.01); G06F 17/18 (2006.01); G06F 18/10 (2023.01); G06N 5/046 (2023.01)
CPC G06N 3/084 (2013.01) [G06F 17/18 (2013.01); G06F 18/10 (2023.01); G06N 5/046 (2013.01)] 13 Claims
OG exemplary drawing
 
1. A computer-implemented method for training a neural network, the computer-implemented method comprising:
initializing one or more model parameters for training the neural network;
performing a forward pass and back propagation for a minibatch of training data comprising a plurality of batches of training data;
determining a new weight value for each of a plurality of nodes of the neural network based on an optimization algorithm;
for each determined new weight value, determining whether to update a running mean corresponding to a weight of each node from the plurality of nodes, wherein determining whether to update the running mean is based on:
whether a current batch of training data falls within a predefined last subset of batches of the minibatch; and
a random determination having a probability based on a value included in the one or more model parameters;
based on a determination to update the running mean, calculating a new mean weight value for each node using the determined new weight value,
wherein when the current batch of training data does not fall within the predefined last subset of batches of the minibatch, the running mean is not updated using the determined new weight value;
updating weight parameters for all nodes based on the calculated new mean weight values corresponding to each node;
assigning the running mean as the weight for each node when training on K number of minibatches is completed, wherein K is a predefined number; and
reinitializing running means for all nodes in the neural network at a start of training a (K+1)th minibatch of the training data.
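The claimed loop can be sketched in plain Python/NumPy. This is an illustrative reading of the claim, not the patented implementation: the function name `train_with_periodic_sampling`, the parameters `last_subset`, `update_prob`, and `K`, and the toy `grad_fn` are all hypothetical stand-ins. The claim's "minibatch ... comprising a plurality of batches" is read here as an outer group of K passes over the batches, with the running mean updated only in the predefined last subset of batches and only when a random draw (with the claimed model-parameter probability) succeeds.

```python
import random
import numpy as np

def train_with_periodic_sampling(
    weights,          # initialized model parameters (claim: "initializing ...")
    grad_fn,          # returns the gradient for one batch (forward + backprop)
    batches,          # the plurality of batches making up one "minibatch"
    lr=0.01,          # step size for the optimization algorithm (plain SGD here)
    last_subset=3,    # size of the predefined last subset of batches
    update_prob=0.5,  # probability used by the random determination (a model parameter)
    K=2,              # predefined number of minibatches before assignment
):
    """Hypothetical sketch of claim 1: SGD steps plus a stochastically
    sampled running mean of the weights that replaces the weights after
    K minibatches and is then reinitialized."""
    running_mean = np.zeros_like(weights)
    n_samples = 0
    for _ in range(K):
        for batch_idx, batch in enumerate(batches):
            # forward pass and back propagation -> new weight values
            weights = weights - lr * grad_fn(weights, batch)
            # update the running mean only if the current batch falls in the
            # predefined last subset AND the random determination succeeds
            in_last_subset = batch_idx >= len(batches) - last_subset
            if in_last_subset and random.random() < update_prob:
                n_samples += 1
                # incremental (Welford-style) update of the mean weight value
                running_mean += (weights - running_mean) / n_samples
    # after K minibatches, assign the running mean as the weights ...
    if n_samples > 0:
        weights = running_mean.copy()
    # ... and reinitialize the running mean before the (K+1)th minibatch
    running_mean = np.zeros_like(weights)
    n_samples = 0
    return weights
```

On a toy quadratic loss (gradient `2*w`), each SGD step shrinks the weights, and the returned value is either the averaged late-training weights or, if no random draw succeeded, the final weights; in both cases the magnitude decreases relative to the start.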