US 12,106,222 B2
Neural network training under memory restraint
Sudipta Sengupta, Sammamish, WA (US); Randy Renfu Huang, Morgan Hill, CA (US); Ron Diamant, Santa Clara, CA (US); and Vignesh Vivekraja, Santa Clara, CA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Feb. 21, 2023, as Appl. No. 18/112,036.
Application 18/112,036 is a division of application No. 16/836,421, filed on Mar. 31, 2020, granted, now 11,610,128.
Prior Publication US 2023/0196113 A1, Jun. 22, 2023
Int. Cl. G06N 3/084 (2023.01); G06N 3/04 (2023.01)
CPC G06N 3/084 (2013.01) [G06N 3/04 (2013.01)]
20 Claims
 
1. An apparatus comprising:
a memory that stores instructions; and
a hardware processor configured to execute the instructions to:
control a neural network processor to perform a loss gradient operation to generate data gradients;
after the loss gradient operation completes, control the neural network processor to perform a forward propagation operation to regenerate intermediate outputs, the intermediate outputs having been previously generated by the neural network processor;
control the neural network processor to perform a backward propagation operation based on the data gradients and the intermediate outputs to generate weight gradients;
receive the weight gradients from the neural network processor; and
update weights of a neural network based on the weight gradients.
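
The ordering of operations recited in claim 1 can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not the patented implementation: the two-layer scalar network, the squared-error loss, and the names `forward`, `train_step`, and `lr` are all hypothetical, and plain Python functions stand in for the neural network processor. The sketch shows only the sequence: the loss gradient operation produces a data gradient, a second forward propagation then regenerates the intermediate outputs that were not retained, backward propagation combines the two to produce weight gradients, and the weights are updated.

```python
def forward(w1, w2, x, keep_intermediates):
    """Toy two-layer network (hypothetical): y = w2 * relu(w1 * x)."""
    z = w1 * x            # first-layer pre-activation (intermediate output)
    h = max(z, 0.0)       # first-layer activation (intermediate output)
    y = w2 * h            # final output
    return (y, (z, h)) if keep_intermediates else (y, None)


def train_step(w1, w2, x, target, lr=0.1):
    """One training step in the order recited in claim 1 (illustrative only)."""
    # Initial forward propagation: the intermediate outputs are generated
    # but not retained, which is where the memory saving would come from.
    y, _ = forward(w1, w2, x, keep_intermediates=False)

    # Loss gradient operation for the assumed loss L = 0.5 * (y - target)**2.
    # Its result is the data gradient with respect to the final output.
    dy = y - target

    # After the loss gradient operation completes, repeat forward
    # propagation to regenerate the previously generated intermediates.
    _, (z, h) = forward(w1, w2, x, keep_intermediates=True)

    # Backward propagation: combine data gradients with the regenerated
    # intermediate outputs to obtain the weight gradients.
    dw2 = dy * h
    dh = dy * w2                      # data gradient flowing into layer 1
    dz = dh if z > 0 else 0.0         # ReLU passes gradient only where z > 0
    dw1 = dz * x

    # Weight update (plain gradient descent, an assumption of this sketch).
    return w1 - lr * dw1, w2 - lr * dw2, (dw1, dw2)


if __name__ == "__main__":
    new_w1, new_w2, grads = train_step(0.5, 2.0, x=3.0, target=1.0)
    print(new_w1, new_w2, grads)
```

The trade-off in this sketch is one extra forward pass per step in exchange for not holding `z` and `h` in memory between the first forward pass and the backward pass.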