US 12,217,179 B2
Intelligent regularization of neural network architectures
Zoubin Ghahramani, Cambridge (GB); Douglas Bemis, San Francisco, CA (US); and Theofanis Karaletsos, San Francisco, CA (US)
Assigned to Uber Technologies, Inc., San Francisco, CA (US)
Filed by Uber Technologies, Inc., San Francisco, CA (US)
Filed on Sep. 25, 2023, as Appl. No. 18/372,415.
Application 18/372,415 is a continuation of application No. 17/513,517, filed on Oct. 28, 2021, granted, now 11,829,876.
Application 17/513,517 is a continuation of application No. 15/789,898, filed on Oct. 20, 2017, granted, now 11,164,076, issued on Nov. 2, 2021.
Claims priority of provisional application 62/451,818, filed on Jan. 30, 2017.
Claims priority of provisional application 62/410,393, filed on Oct. 20, 2016.
Prior Publication US 2024/0013049 A1, Jan. 11, 2024
Int. Cl. G06N 3/08 (2023.01); G06N 3/04 (2023.01); G06N 3/045 (2023.01); G06N 3/082 (2023.01); G06N 3/0985 (2023.01); G06N 7/01 (2023.01)

CPC G06N 3/08 (2013.01) [G06N 3/045 (2023.01); G06N 3/082 (2013.01); G06N 3/0985 (2023.01); G06N 7/01 (2023.01)]

20 Claims

1. A non-transitory computer-readable medium storing instructions that, when executed by a computing system, cause the computing system to perform operations comprising:

receiving a set of direct inputs;

providing the set of direct inputs to a direct network having a set of weights, wherein the set of weights were trained by a process including:

setting initial values for the set of weights;

processing training input using the initial values to generate training output;

obtaining distributions of a set of expected weights for the direct network, the distributions of the set of expected weights generated by an indirect network using a set of indirect parameters, wherein at least some of the set of indirect parameters for a node describe a location of the node within the direct network by specifying a layer and an index of the node;

identifying an error between an expected output and the training output generated from the direct network; and

updating the set of weights based on the error and the distributions of the set of expected weights; and

generating, by the direct network, a direct output from the direct network, using the set of weights.
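
The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It makes several assumptions that do not appear in the claim: the direct network is a single linear node, each weight is associated with a location given by a (layer, index) pair, the indirect network is a hypothetical linear function of that pair with coefficients `phi`, each distribution is a Gaussian, and the update is one gradient step on squared error plus the negative log-density of each weight under its distribution. All function and variable names are illustrative.

```python
import math
import random


def indirect_network(layer, index, phi):
    """Hypothetical indirect network (assumption, not from the claim).

    Maps indirect parameters describing a location (layer, index) to the mean
    and standard deviation of a Gaussian over the corresponding expected weight.
    """
    a, b, c, log_sigma = phi
    mean = a * layer + b * index + c
    return mean, math.exp(log_sigma)


def direct_forward(weights, x):
    """Direct network, assumed here to be one linear node: sum_i w_i * x_i."""
    return sum(w * xi for w, xi in zip(weights, x))


def train_step(weights, x, y_expected, phi, layer=0, lr=0.05):
    """One update of the weights based on the error and the distributions."""
    y_train = direct_forward(weights, x)   # training output
    error = y_train - y_expected           # error against the expected output
    new_weights = []
    for index, (w, xi) in enumerate(zip(weights, x)):
        mean, sigma = indirect_network(layer, index, phi)
        grad_error = error * xi               # d/dw of 0.5 * error**2
        grad_prior = (w - mean) / sigma ** 2  # d/dw of -log N(w; mean, sigma**2)
        new_weights.append(w - lr * (grad_error + grad_prior))
    return new_weights


random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(3)]  # initial values
phi = (0.0, 0.5, 0.0, 0.0)              # illustrative indirect-network coefficients
x, y_expected = [1.0, 2.0, 3.0], 10.0   # illustrative training input and expected output

for _ in range(200):
    weights = train_step(weights, x, y_expected, phi)

# Inference: the direct network generates a direct output using the trained weights.
direct_output = direct_forward(weights, x)
print(weights, direct_output)
```

In this sketch the distributions act as a regularizer: the error term moves the weights toward reproducing the expected output, while the Gaussian term pulls each weight toward the mean that the indirect network assigns to its location.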