US 12,346,792 B2
Accelerated training of neural networks with regularization links
James K. Baker, Maitland, FL (US); and Bradley J. Baker, Berwyn, PA (US)
Assigned to D5AI LLC, Maitland, FL (US)
Filed by D5AI LLC, Maitland, FL (US)
Filed on Jan. 3, 2025, as Appl. No. 19/009,560.
Application 18/587,242 is a division of application No. 18/327,527, filed on Jun. 1, 2023, granted, now 11,948,063, issued on Apr. 2, 2024.
Application 19/009,560 is a continuation of application No. 18/587,242, filed on Feb. 26, 2024, granted, now 12,205,010.
Application 18/327,527 is a continuation of application No. 17/387,211, filed on Jul. 28, 2021, granted, now 11,836,600, issued on Dec. 5, 2023.
Claims priority of provisional application 63/068,080, filed on Aug. 20, 2020.
Prior Publication US 2025/0139409 A1, May 1, 2025
Int. Cl. G06N 3/045 (2023.01)
CPC G06N 3/045 (2023.01) 36 Claims
OG exemplary drawing
 
1. A method to accelerate training of a first neural network architecture, wherein the first neural network architecture comprises layers k=0, . . . , K, where K is greater than or equal to two, wherein the Kth layer is an output layer of the first neural network architecture and the 0th layer is an input layer of the first neural network architecture, and wherein each of the K layers comprises one or more nodes, such that the first neural network architecture comprises at least a node P on one of the layers, the method comprising:
assigning, by a programmed computer system, weights to connections in a second neural network architecture, the second neural network architecture comprising layers l=0, . . . , L, where L is greater than two, wherein the Lth layer is an output layer of the second neural network architecture and the 0th layer is an input layer of the second neural network architecture, and wherein each of the L layers comprises one or more nodes, including a node R on a layer of the second neural network architecture other than the 0th layer, wherein the connections in the second neural network architecture comprise incoming connections to nodes on layers l=1, . . . , L; and
after assigning the weights to the connections in the second neural network architecture:
storing, in a memory of the programmed computer system, preliminary activation values for the node R in the second neural network architecture for a training datum in a training data set; and
training, by the programmed computer system, via machine learning, the first neural network architecture, wherein training the first neural network architecture comprises imposing, by the programmed computer system, a regularization link between the node R of the second neural network architecture and the node P of the first neural network architecture, wherein imposing the regularization link comprises adding, during back-propagation of partial derivatives through the first neural network architecture for the training datum, a regularization cost to a loss function for the first neural network architecture for node P upon a specified binary relationship between a stored preliminary activation value for node R for the training datum and an activation value for node P for the training datum not being satisfied.
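The claimed mechanism can be illustrated with a minimal sketch, not taken from the patent itself: all names (`activation_R`, `train_step`, the weight arrays, the choice of squared-error loss, and the particular binary relationship "P and R lie on the same side of 0.5") are hypothetical assumptions chosen for illustration. The sketch stores preliminary activation values for a node R of a pre-weighted second network, then trains a single-node first network, adding a regularization cost to node P's gradient only when the binary relationship is not satisfied.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# --- Second network: assign weights, then store preliminary activation
# --- values for node R for each training datum (illustrative names).
W2 = rng.normal(size=(3,))            # incoming connection weights to node R

def activation_R(x):
    return sigmoid(x @ W2)

training_data = rng.normal(size=(8, 3))
targets = rng.integers(0, 2, size=8).astype(float)
stored_R = activation_R(training_data)  # one stored value per training datum

# --- First network: a single node P, trained with a regularization link to R.
W1 = rng.normal(size=(3,))
lam, lr = 0.1, 0.5                     # regularization strength, learning rate

def train_step(x, y, a_R):
    """Return the gradient of (loss + conditional regularization cost) w.r.t. W1."""
    a_P = sigmoid(x @ W1)
    d_loss = a_P - y                   # derivative of 0.5*(a_P - y)**2 w.r.t. a_P
    # Regularization link: the cost is added during back-propagation only
    # when the specified binary relationship between the stored activation
    # for R and the activation for P is NOT satisfied (here, the assumed
    # relationship is that both lie on the same side of 0.5).
    relationship_satisfied = (a_P - 0.5) * (a_R - 0.5) >= 0
    if not relationship_satisfied:
        d_loss += lam * (a_P - a_R)    # derivative of 0.5*lam*(a_P - a_R)**2
    return d_loss * a_P * (1.0 - a_P) * x  # back-propagate through the sigmoid

for epoch in range(20):
    for x, y, a_R in zip(training_data, targets, stored_R):
        W1 -= lr * train_step(x, y, a_R)
```

Because the stored values for R are computed once, before training of the first network begins, the second network need not be evaluated during the training loop; the regularization term costs only a comparison and a subtraction per datum.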