CPC G06N 3/084 (2013.01) [G06F 18/213 (2023.01); G06N 3/048 (2023.01); G06N 3/08 (2013.01)] | 20 Claims |
1. A method implemented by one or more computers, the method comprising:
processing a network input using a neural network comprising a plurality of neural network layers to generate a network output, wherein each neural network layer is configured to process a respective layer input in accordance with respective values of a plurality of layer weights to generate a respective layer output, wherein one or more of the neural network layers is a conditional neural network layer, and wherein processing a layer input using a conditional neural network layer to generate a layer output comprises:
obtaining values of one or more decision parameters of the conditional neural network layer;
processing (i) the layer input, and (ii) the decision parameters of the conditional neural network layer, to determine values of one or more latent parameters of the conditional neural network layer from a continuous set of possible latent parameter values, wherein the one or more latent parameters of the conditional neural network layer parameterize a respective value of each of a plurality of layer weights of the conditional neural network layer as a B-spline or as a hyper-surface defined as a sum of B-splines;
determining the respective value of each of the plurality of layer weights of the conditional neural network layer from the values of the latent parameters of the conditional neural network layer; and
processing the layer input in accordance with the values of the plurality of layer weights of the conditional neural network layer, determined in accordance with the parametrization of the plurality of layer weights of the conditional neural network layer as the B-spline or the hyper-surface defined as the sum of B-splines, to generate the layer output;
wherein each B-spline is defined by a plurality of knots, and wherein the plurality of knots of each B-spline have been trained jointly with the decision parameters of the conditional neural network layer to optimize an objective function that measures a performance of the neural network on a machine learning task.
|
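The forward pass recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation, and everything specific in it is an assumption made for illustration: the function names `bspline_basis` and `conditional_layer`, the use of a single scalar latent parameter obtained by applying a sigmoid to the dot product of the layer input and the decision parameters, the clamped quadratic knot vector, the `control_weights` tensor holding one weight matrix per B-spline basis function, and the ReLU activation. Joint training of the knots and decision parameters, which the claim requires, is omitted here; the sketch only shows how layer weights might be read off a B-spline at a continuous latent position.

```python
import numpy as np


def bspline_basis(t, knots, degree):
    """Evaluate all B-spline basis functions at scalar t (Cox-de Boor recursion)."""
    knots = np.asarray(knots, dtype=float)
    # Degree-0 basis: indicator of each half-open knot interval [k_i, k_{i+1}).
    basis = np.where((knots[:-1] <= t) & (t < knots[1:]), 1.0, 0.0)
    for p in range(1, degree + 1):
        new = np.zeros(len(knots) - p - 1)
        for i in range(len(new)):
            left_den = knots[i + p] - knots[i]
            right_den = knots[i + p + 1] - knots[i + 1]
            # Terms with a zero-width knot span are defined to be zero.
            left = (t - knots[i]) / left_den * basis[i] if left_den > 0 else 0.0
            right = (
                (knots[i + p + 1] - t) / right_den * basis[i + 1]
                if right_den > 0
                else 0.0
            )
            new[i] = left + right
        basis = new
    return basis


def conditional_layer(x, decision_params, knots, control_weights, degree=2):
    """Hypothetical conditional layer: weights are a B-spline in a latent parameter."""
    # Latent parameter: a continuous value in [0, 1) computed from the layer
    # input and the decision parameters (assumed form: sigmoid of a dot product).
    phi = 1.0 / (1.0 + np.exp(-(x @ decision_params)))
    phi = min(phi, 1.0 - 1e-9)  # stay inside the half-open knot range
    # Layer weights: B-spline evaluated at phi, one control matrix per basis function.
    basis = bspline_basis(phi, knots, degree)          # shape (n_basis,)
    weights = np.tensordot(basis, control_weights, 1)  # shape (d_out, d_in)
    # Process the layer input with the weights selected by the latent parameter.
    return np.maximum(weights @ x, 0.0), phi


rng = np.random.default_rng(0)
d_in, d_out, degree = 3, 2, 2
knots = np.array([0, 0, 0, 0.5, 1, 1, 1], dtype=float)  # clamped, assumed fixed here
n_basis = len(knots) - degree - 1
control_weights = rng.normal(size=(n_basis, d_out, d_in))
decision_params = rng.normal(size=d_in)
x = rng.normal(size=d_in)

y, phi = conditional_layer(x, decision_params, knots, control_weights, degree)
print("latent parameter:", phi)
print("layer output:", y)
```

Because the basis functions of a clamped knot vector sum to one inside the knot range, the resulting weight matrix is a convex blend of the control matrices that varies smoothly as the latent parameter moves. In a trainable version, the knots, control matrices, and decision parameters would all receive gradients from the task objective.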