US 12,093,830 B2
Continuous parametrizations of neural network layer weights
Shahram Izadi, Tiburon, CA (US); and Cem Keskin, San Francisco, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Appl. No. 16/976,805
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Jul. 23, 2019, PCT No. PCT/US2019/042989
§ 371(c)(1), (2) Date Aug. 31, 2020,
PCT Pub. No. WO2020/023483, PCT Pub. Date Jan. 30, 2020.
Claims priority of provisional application 62/702,055, filed on Jul. 23, 2018.
Prior Publication US 2021/0365777 A1, Nov. 25, 2021
Int. Cl. G06N 3/084 (2023.01); G06F 18/213 (2023.01); G06N 3/048 (2023.01); G06N 3/08 (2023.01)
CPC G06N 3/084 (2013.01) [G06F 18/213 (2023.01); G06N 3/048 (2023.01); G06N 3/08 (2013.01)]
20 Claims
 
1. A method implemented by one or more computers, the method comprising:
processing a network input using a neural network comprising a plurality of neural network layers to generate a network output, wherein each neural network layer is configured to process a respective layer input in accordance with respective values of a plurality of layer weights to generate a respective layer output, wherein one or more of the neural network layers is a conditional neural network layer, and wherein processing a layer input using a conditional neural network layer to generate a layer output comprises:
obtaining values of one or more decision parameters of the conditional neural network layer;
processing (i) the layer input, and (ii) the decision parameters of the conditional neural network layer, to determine values of one or more latent parameters of the conditional neural network layer from a continuous set of possible latent parameter values, wherein the one or more latent parameters of the conditional neural network layer parameterize a respective value of each of a plurality of layer weights of the conditional neural network layer as a B-spline or as a hyper-surface defined as a sum of B-splines;
determining the respective value of each of the plurality of layer weights of the conditional neural network layer from the values of the latent parameters of the conditional neural network layer; and
processing the layer input in accordance with the values of the plurality of layer weights of the conditional neural network layer, determined in accordance with the parametrization of the plurality of layer weights of the conditional neural network layer as the B-spline or the hyper-surface defined as the sum of B-splines, to generate the layer output;
wherein each B-spline is defined by a plurality of knots, and wherein the plurality of knots of each B-spline have been trained jointly with the decision parameters of the conditional neural network layer to optimize an objective function that measures a performance of the neural network on a machine learning task.
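The forward pass recited in claim 1 can be illustrated with a minimal sketch, which is not the patented implementation. It rests on several assumptions that do not appear in the claim: a single latent parameter is computed as a sigmoid of the dot product between the layer input and the decision parameters; the trained "knots" are treated as control points in weight space (`control_points`); and a fixed clamped `knot_vector` on [0, 1] defines the B-spline basis. All function and variable names are hypothetical, and the joint training of the knots and decision parameters is omitted.

```python
import numpy as np


def bspline_basis(t, knot_vector, degree):
    """Evaluate all B-spline basis functions at scalar t (Cox-de Boor recursion)."""
    # Degree-0 basis: indicator of each half-open knot interval.
    basis = np.array([
        1.0 if knot_vector[i] <= t < knot_vector[i + 1] else 0.0
        for i in range(len(knot_vector) - 1)
    ])
    for p in range(1, degree + 1):
        nxt = np.zeros(len(knot_vector) - p - 1)
        for i in range(len(nxt)):
            left_den = knot_vector[i + p] - knot_vector[i]
            right_den = knot_vector[i + p + 1] - knot_vector[i + 1]
            # Terms with a zero denominator (repeated knots) are skipped.
            if left_den > 0:
                nxt[i] += (t - knot_vector[i]) / left_den * basis[i]
            if right_den > 0:
                nxt[i] += (knot_vector[i + p + 1] - t) / right_den * basis[i + 1]
        basis = nxt
    return basis  # length: len(knot_vector) - degree - 1


def conditional_layer(x, decision_params, control_points, knot_vector, degree=2):
    """Forward pass of a hypothetical conditional layer with spline-parametrized weights."""
    # Assumption: the latent parameter is a sigmoid of a linear projection,
    # so it lies in the continuous interval (0, 1).
    phi = 1.0 / (1.0 + np.exp(-np.dot(decision_params, x)))
    phi = min(phi, 1.0 - 1e-9)  # stay inside the half-open spline domain
    # Layer weights are the point on the B-spline curve at position phi.
    basis = bspline_basis(phi, knot_vector, degree)
    weights = np.tensordot(basis, control_points, axes=1)  # shape (out, in)
    # Process the layer input with the weights selected by phi.
    return weights @ x, phi, weights


# Illustrative setup: 4 control points, each a 3x5 weight matrix.
rng = np.random.default_rng(0)
knot_vector = np.array([0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0])  # clamped, degree 2
control_points = rng.normal(size=(4, 3, 5))
decision_params = np.zeros(5)
x = rng.normal(size=5)

y, phi, weights = conditional_layer(x, decision_params, control_points, knot_vector)
```

Because the layer weights depend smoothly on the latent parameter, gradients of an objective could in principle flow to both the control points and the decision parameters, which is consistent with the joint training the claim describes. A hyper-surface defined as a sum of B-splines would, under the same assumptions, add one such curve per latent parameter.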