CPC G06N 3/08 (2013.01) [G06N 3/084 (2013.01); G06N 7/01 (2023.01); H04L 1/24 (2013.01)] | 18 Claims |
1. A method for training a neural network comprising a plurality of nodes that use a plurality of weights, wherein each node of a set of the nodes produces a node output value by computing a dot product of weight values for the node and input values for the node that are node output values of previous nodes, the method comprising:
propagating a plurality of inputs through the neural network to generate an output for each of the inputs, wherein each weight of a set of the weights is defined as a probability distribution across a set of allowable values for the weight, wherein for each weight, the set of allowable values for the weight comprises the value zero, a positive value for the weight, and a negation of the positive value for the weight, wherein propagating a particular input through the neural network comprises, for at least a particular node:
computing a node output value probability distribution by computing (i) a mean node output value for the particular node based on a dot product of means of the weight values for the particular node and the input values for the particular node and (ii) a variance for the particular node based on variances of the weight values for the particular node and the input values for the particular node; and
randomly sampling from the computed node output value probability distribution for the particular node to determine the node output value for the particular node; and
using the outputs generated for the plurality of inputs to train the weights.
|
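The forward computation recited in claim 1 for a single node can be illustrated with a minimal sketch. It is not the patented implementation. It makes several assumptions that the claim does not state: weights are treated as mutually independent, the input values are treated as fixed numbers (so only weight variance contributes), and the node output distribution is approximated as Gaussian. The names `ternary_moments`, `sample_node_output`, `probs`, and `alpha` are hypothetical and chosen for illustration only.

```python
import numpy as np


def ternary_moments(probs, alpha):
    """Mean and variance of weights that take values {-alpha, 0, +alpha}.

    probs: array of shape (n, 3); columns hold P(-alpha), P(0), P(+alpha).
    alpha: the positive allowable value (scalar or array of shape (n,)).
    """
    p_neg, p_pos = probs[:, 0], probs[:, 2]
    mean = alpha * (p_pos - p_neg)
    second_moment = alpha ** 2 * (p_pos + p_neg)  # the zero value adds nothing
    return mean, second_moment - mean ** 2


def sample_node_output(x, probs, alpha, rng):
    """Return (mean, variance, sampled value) for one node.

    Assumption: independent weights and deterministic inputs, so the
    variance of the dot product is sum(var_w * x**2).
    Assumption: the output distribution is approximated as Gaussian.
    """
    w_mean, w_var = ternary_moments(probs, alpha)
    out_mean = np.dot(w_mean, x)       # step (i): dot product of weight means and inputs
    out_var = np.dot(w_var, x ** 2)    # step (ii): variance from the weight variances
    sample = out_mean + np.sqrt(out_var) * rng.standard_normal()
    return out_mean, out_var, sample


rng = np.random.default_rng(0)
x = np.array([1.0, 3.0])
probs = np.array([[0.25, 0.5, 0.25],   # symmetric weight: mean 0, nonzero variance
                  [0.0, 0.0, 1.0]])    # weight fixed at +alpha: zero variance
mean, var, value = sample_node_output(x, probs, alpha=2.0, rng=rng)
```

In this sketch the sampled value is written as the mean plus scaled standard normal noise, which keeps it a differentiable function of the weight probabilities. That is one plausible way the sampled outputs could feed gradient-based training of the weights; the claim itself does not specify the training procedure.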