| CPC G06N 3/08 (2013.01) [G06N 3/045 (2023.01)] | 20 Claims |

|
1. A method for training a neural network to perform a first prediction task, the method comprising:
obtaining trained model parameters for each of a plurality of candidate neural networks, wherein each candidate neural network has been pre-trained to perform a respective second prediction task that is different from the first prediction task;
obtaining a plurality of training examples corresponding to the first prediction task;
prior to fine-tuning any of the plurality of candidate neural networks for the first prediction task:
predicting, for each of the plurality of candidate neural networks and using the plurality of training examples, a respective performance of the candidate neural network on the first prediction task, and
selecting a proper subset of the plurality of candidate neural networks using the respective predicted performance on the first prediction task for each of the candidate neural networks;
after selecting the proper subset, fine-tuning only the candidate neural networks in the proper subset for the first prediction task by generating, for each candidate neural network in the proper subset, one or more fine-tuned neural networks, wherein each of the one or more fine-tuned neural networks is generated by updating the model parameters of the candidate neural network using the plurality of training examples; and
determining model parameters for the neural network using the one or more fine-tuned neural networks.
|
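The steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify how performance is predicted, how many networks are selected, or how the final parameters are determined, so everything concrete here is an assumption made for illustration: each candidate is a single tanh layer with weight matrix `W`, the predicted performance is the accuracy of a nearest-class-centroid classifier on the frozen features, the proper subset is the top `k` candidates, each selected candidate is fine-tuned once per learning rate, and the final parameters are those of the fine-tuned network with the lowest training loss. All function names are hypothetical.

```python
import numpy as np

def predict_performance(W, X, y):
    """Proxy score computed WITHOUT fine-tuning (assumed proxy, not from the claim):
    accuracy of a nearest-class-centroid classifier on the frozen features."""
    H = np.tanh(X @ W)
    classes = np.unique(y)
    centroids = np.stack([H[y == c].mean(axis=0) for c in classes])
    d2 = ((H[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float((classes[d2.argmin(axis=1)] == y).mean())

def select_proper_subset(scores, k):
    """Return indices of the k best-scoring candidates, best first.
    k must be strictly smaller than the number of candidates (proper subset)."""
    if not 0 < k < len(scores):
        raise ValueError("k must give a non-empty proper subset")
    order = np.argsort(scores)[::-1]
    return [int(i) for i in order[:k]]

def fine_tune(W, X, y, lr, steps=200):
    """Update a copy of the candidate's parameters (plus a new linear head v)
    by gradient descent on binary cross-entropy; y holds 0.0/1.0 labels."""
    W = W.copy()
    v = np.zeros(W.shape[1])
    n = len(y)
    for _ in range(steps):
        H = np.tanh(X @ W)
        p = 1.0 / (1.0 + np.exp(-(H @ v)))
        g = (p - y) / n                       # d(loss)/d(logits)
        grad_v = H.T @ g
        grad_W = X.T @ (np.outer(g, v) * (1.0 - H ** 2))
        v -= lr * grad_v
        W -= lr * grad_W
    H = np.tanh(X @ W)
    p = 1.0 / (1.0 + np.exp(-(H @ v)))
    eps = 1e-12
    loss = float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))
    return W, v, loss

def train_for_first_task(candidates, X, y, k, learning_rates=(0.1, 0.5)):
    # Before any fine-tuning: score every candidate, then keep a proper subset.
    scores = [predict_performance(W, X, y) for W in candidates]
    subset = select_proper_subset(scores, k)
    # Fine-tune only the selected candidates, one or more runs per candidate.
    tuned = [fine_tune(candidates[i], X, y, lr)
             for i in subset for lr in learning_rates]
    # Determine final parameters (assumed rule: lowest training loss).
    W, v, loss = min(tuned, key=lambda t: t[2])
    return subset, (W, v), loss
```

In this sketch the cost saving comes from `predict_performance` needing only a forward pass per candidate, while `fine_tune` runs only for the `k` selected candidates. A held-out validation set would be a more typical basis for the final choice than training loss; training loss is used here only to keep the example short.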