CPC G06N 3/047 (2023.01) [G06N 3/048 (2023.01); G06N 3/084 (2013.01)] | 20 Claims |
1. A method for training a machine-trained (MT) network, the method comprising:
using a set of training inputs to train parameters of the MT network according to an initial loss function that is a first combination of a plurality of possible loss functions defined by initial values for a plurality of coefficients;
using a set of validation inputs to compute an error measure for the MT network as trained by the first set of training inputs;
modifying the loss function for subsequent training of the MT network based on the error measure computed using the set of validation inputs to generate a modified loss function that is a second combination of the plurality of possible loss functions defined by modified values for the plurality of coefficients, wherein the plurality of coefficients are continuously differentiable with respect to a description length score that accounts for (i) an amount of information required to modify the loss function and (ii) improvements to predictiveness of the MT network based on modifications to the loss function; and
using the set of training inputs to train the parameters of the MT network according to the loss function as modified based on the error measure computed using the set of validation inputs.
|
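The four steps of claim 1 can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the claim: the one-parameter linear model, the two candidate losses (squared and absolute error), the toy data, and in particular `score`, which is a hypothetical stand-in for the description length score (validation error plus a penalty on how far the coefficients move). The coefficient update uses a finite-difference gradient step on that score, which is one possible reading of the differentiability limitation, not necessarily the claimed one.

```python
# Illustrative sketch only: the model, candidate losses, data, and score
# are assumptions, not the patent's actual formulation.

def grad_sq(pred, y):
    """Derivative of the squared error (pred - y)^2 with respect to pred."""
    return 2.0 * (pred - y)

def grad_abs(pred, y):
    """Subgradient of the absolute error |pred - y| with respect to pred."""
    return (pred > y) - (pred < y)

# The "plurality of possible loss functions", represented by their gradients.
CANDIDATE_GRADS = [grad_sq, grad_abs]

def train(coeffs, data, w=0.0, lr=0.01, epochs=200):
    """Fit y ~ w * x by gradient descent on sum_k coeffs[k] * loss_k."""
    for _ in range(epochs):
        g = 0.0
        for x, y in data:
            pred = w * x
            g += sum(c * dl(pred, y) for c, dl in zip(coeffs, CANDIDATE_GRADS)) * x
        w -= lr * g / len(data)
    return w

def validation_error(w, data):
    """Mean squared error on held-out data (the 'error measure')."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def score(coeffs, base, train_data, val_data, penalty=0.1):
    """Hypothetical score: validation error plus cost of changing coeffs."""
    w = train(coeffs, train_data)
    change_cost = penalty * sum((c - b) ** 2 for c, b in zip(coeffs, base))
    return validation_error(w, val_data) + change_cost

def modify_coeffs(coeffs, train_data, val_data, step=0.5, eps=1e-3):
    """One finite-difference gradient step on the score, clipped at zero."""
    base = list(coeffs)
    s0 = score(base, base, train_data, val_data)
    new = []
    for k in range(len(base)):
        bumped = list(base)
        bumped[k] += eps
        g = (score(bumped, base, train_data, val_data) - s0) / eps
        new.append(max(0.0, base[k] - step * g))
    return new

# Toy data: the training set has one outlier, the validation set follows y = 2x.
train_data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 20.0)]
val_data = [(1.5, 3.0), (2.5, 5.0), (3.5, 7.0)]

coeffs = [1.0, 0.0]                                    # initial combination
w1 = train(coeffs, train_data)                         # step 1: initial training
err1 = validation_error(w1, val_data)                  # step 2: validation error
coeffs2 = modify_coeffs(coeffs, train_data, val_data)  # step 3: modify the loss
w2 = train(coeffs2, train_data, w=w1)                  # step 4: retrain
err2 = validation_error(w2, val_data)
```

In this toy setup the update shifts a little weight onto the absolute-error term, which is less sensitive to the training outlier, so the validation error after retraining is slightly lower than before. The size of the improvement depends entirely on the assumed `step` and `penalty` values.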