US 11,748,624 B2
Evaluating the value of connecting a selected pair of unconnected nodes of a nodal network
James K. Baker, Maitland, FL (US); and Bradley J. Baker, Berwyn, PA (US)
Assigned to D5AI LLC, Maitland, FL (US)
Filed by D5AI LLC, Maitland, FL (US)
Filed on Jul. 15, 2020, as Appl. No. 16/929,900.
Application 16/929,900 is a continuation of application No. 16/767,966, granted, now 11,461,655, previously published as PCT/US2019/015389, filed on Jan. 28, 2019.
Claims priority of provisional application 62/647,085, filed on Mar. 23, 2018.
Claims priority of provisional application 62/623,773, filed on Jan. 30, 2018.
Prior Publication US 2020/0356861 A1, Nov. 12, 2020
Int. Cl. G06N 3/082 (2023.01); G06N 3/045 (2023.01); G06N 5/046 (2023.01); G06N 20/20 (2019.01); G06N 3/084 (2023.01); G06N 3/08 (2023.01); H04L 67/142 (2022.01); G06N 3/04 (2023.01); G06N 20/00 (2019.01); G06F 16/901 (2019.01); G06F 18/24 (2023.01); G06F 18/214 (2023.01); G06F 18/21 (2023.01); G06N 3/048 (2023.01)

CPC G06N 3/082 (2013.01) [G06F 16/9024 (2019.01); G06F 18/214 (2023.01); G06F 18/217 (2023.01); G06F 18/24 (2023.01); G06N 3/04 (2013.01); G06N 3/045 (2023.01); G06N 3/048 (2023.01); G06N 3/08 (2013.01); G06N 3/084 (2013.01); G06N 20/00 (2019.01); G06N 20/20 (2019.01); H04L 67/142 (2013.01); G06F 18/2148 (2023.01); G06N 5/046 (2013.01)]

16 Claims

1. A method of training a neural network, the method comprising:

in an initial training of the neural network, training the neural network, by a computer system, wherein the neural network comprises a plurality of layers, including an input layer, an output layer, and at least one hidden layer, wherein each layer comprises at least one node, such that the neural network comprises a plurality of nodes, including a node A and a node B and a node C, wherein node A is not in the output layer and node B is not in the input layer, such that a weighted output from node A is not input to node B in a feedforward phase through the neural network such that the output from node A is not used in computing an activation value for node B in the feedforward phase, and wherein there is no direct connection from node A to node B in the neural network after the initial training and there is a direct connection from node C to node B in the neural network after the initial training, such that a weighted output from node C is input to node B in a feedforward phase through the neural network such that the output from node C is used in computing the activation value for node B in the feedforward phase, and wherein the initial training comprises, for each training data item in a first set of training data:

computing, in the feedforward phase, activation values for nodes of the neural network for the training data item, wherein computing the activation for node B comprises computing the activation value for node B based on a weighted value of the activation value for node C due to the direct connection from node C to node B, where a weight for the activation value for node C used in the computation of the activation value for node B is a weight of the direct connection from node C to node B; and

computing, in a back-propagation phase, estimates of partial derivatives for each of nodes A, B and C with respect to an objective for the neural network, wherein the objective is the same for nodes A, B and C in the initial training;

after the initial training, evaluating, by the computer system, whether to add a direct connection from node A in the neural network to node B in the neural network, such that after adding the direct connection from node A to node B, the activation value of node A, weighted by a connection weight for the direct connection from node A to node B, would be used in computation of the activation value for node B, wherein evaluating whether to add the direct connection from node A to node B comprises estimating, by the computer system, an improvement in the objective of the neural network with the direct connection from node A to node B, wherein estimating the improvement in the objective comprises computing, by the computer system, a value of adding the direct connection from node A to node B, wherein computing the value comprises computing, by the computer system, a sum, over a second set of training data, of products of multiple factors, wherein the multiple factors comprise, for each item in the second set of training data, an activation value for node A and a partial derivative of an error loss function with respect to each input to node B; and

adding, by the computer system, the direct connection from node A to node B upon a determination by the computer system that an outcome of estimating the improvement in the objective of the neural network meets a criterion for adding the direct connection, such that, in a subsequent feedforward phase of a subsequent training, weighted outputs from both node A and node C are used in computing the activation value for node B.
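
For illustration only, the evaluation recited in claim 1 can be sketched in Python with NumPy. The sketch assumes a toy case in which node B is a linear node with a squared-error loss, so that the partial derivative of the loss with respect to the input to node B is simply the output error. The function names, the data values, the threshold criterion, and the rule used to initialize the new weight are all hypothetical; none is taken from the patent. The sketch is not the patented implementation.

```python
import numpy as np

def connection_value(act_a, dloss_dinput_b):
    """Sum, over data items, of act_A * dLoss/d(input to B).

    This equals the gradient of the loss with respect to a hypothetical
    A->B connection weight that is currently zero.
    """
    return float(np.sum(np.asarray(act_a) * np.asarray(dloss_dinput_b)))

def should_add_connection(value, threshold):
    """Hypothetical criterion: add the connection if |value| exceeds a threshold."""
    return abs(value) > threshold

# Toy "second set of training data" (made-up numbers for illustration).
act_a = np.array([1.0, -1.0, 2.0, 0.5])      # activations of node A
act_c = np.array([0.2, 0.4, -0.1, 0.3])      # activations of node C
targets = np.array([1.1, -0.8, 1.95, 0.65])  # desired outputs of node B
w_cb = 0.5                                   # weight of existing C->B connection

# Feedforward phase: node B is assumed linear, so its activation is its input.
input_b = w_cb * act_c
act_b = input_b

# Back-propagation phase: for loss 0.5 * (b - y)^2 and a linear node B,
# dLoss/d(input to B) is (b - y).
dloss_dinput_b = act_b - targets
loss_before = 0.5 * float(np.sum(dloss_dinput_b ** 2))

# Evaluate the value of adding the direct connection A->B.
value = connection_value(act_a, dloss_dinput_b)
add_it = should_add_connection(value, threshold=1.0)

# If the criterion is met, add the connection. The initial weight below is
# one gradient-descent step from zero (an assumption, not from the claim).
learning_rate = 0.1
w_ab = -learning_rate * value if add_it else 0.0

# Subsequent feedforward phase: both A and C now feed node B.
act_b_after = w_cb * act_c + w_ab * act_a
loss_after = 0.5 * float(np.sum((act_b_after - targets) ** 2))
```

In this toy case the value has a large magnitude because the remaining error at node B is strongly correlated with the activation of node A, and the loss falls once the connection is added. A value near zero would indicate that, to first order, the connection offers little improvement on this data.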