US 11,657,280 B1
Reinforcement learning techniques for network-based transfer learning
Yue Guo, Pittsburgh, PA (US); I-Hsuan Yang, Mountain View, CA (US); and Yu Wang, San Jose, CA (US)
Assigned to PlusAI, Inc., Santa Clara, CA (US)
Filed by PlusAI, Inc., Santa Clara, CA (US)
Filed on Nov. 1, 2022, as Appl. No. 17/978,491.
Int. Cl. G06N 3/08 (2023.01)
CPC G06N 3/08 (2013.01)
19 Claims
 
1. A method comprising:
obtaining, by a computing device, a source neural network that is associated with a set of parameters that have been determined based at least in part on a machine-learning algorithm and source training data associated with a source domain, the source training data being inaccessible to the computing device;
initializing, by the computing device, a plurality of candidate target neural networks with a subset of parameters respectively transferred from the set of parameters associated with the source neural network;
obtaining, by the computing device, target training data associated with a target domain, the target domain being different from the source domain;
generating, by the computing device, a plurality of trained candidate target neural networks based at least in part on training each of the plurality of candidate target neural networks utilizing a reinforcement learning algorithm, wherein training a candidate target neural network comprises:
selecting the candidate target neural network according to the reinforcement learning algorithm; and
finetuning one or more parameters of the candidate target neural network based on providing the candidate target neural network with a subset of the target training data as input, wherein the one or more parameters that are finetuned correspond to the subset of parameters with which the candidate target neural network was initialized;
selecting, by the computing device, a trained candidate target neural network from the plurality of trained candidate target neural networks based on subsequent output provided by the plurality of trained candidate target neural networks; and
performing, by the computing device, one or more operations based at least in part on the trained candidate target neural network selected.
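
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it makes several assumptions that the claim does not state: a small linear model stands in for a neural network, an epsilon-greedy bandit stands in for "a reinforcement learning algorithm", the reward is the negative held-out loss, and parameters that are not transferred start at zero and stay frozen. All function and variable names (`transfer_and_select`, `finetune`, `subsets`, and so on) are hypothetical. Only the source parameters are passed in, reflecting that the source training data is inaccessible.

```python
import random


def predict(weights, x):
    """Linear stand-in for a network forward pass (assumption)."""
    return sum(w * xi for w, xi in zip(weights, x))


def mse(weights, data):
    """Mean squared error of the model over (x, y) pairs."""
    return sum((predict(weights, x) - y) ** 2 for x, y in data) / len(data)


def finetune(weights, trainable, batch, lr=0.05):
    """One gradient step on MSE that updates only the transferred indices."""
    grads = [0.0] * len(weights)
    for x, y in batch:
        err = predict(weights, x) - y
        for i in trainable:
            grads[i] += 2 * err * x[i] / len(batch)
    for i in trainable:
        weights[i] -= lr * grads[i]


def transfer_and_select(source_weights, subsets, train, val,
                        steps=300, eps=0.2, batch_size=8, seed=0):
    """Initialize candidates from subsets of source parameters, train them
    under an epsilon-greedy bandit (assumed RL algorithm), and pick the
    candidate with the lowest loss on held-out target data."""
    rng = random.Random(seed)
    # Each candidate receives its own subset of the source parameters;
    # parameters that are not transferred start at 0.0 (assumption).
    candidates = [[w if i in s else 0.0 for i, w in enumerate(source_weights)]
                  for s in subsets]
    counts = [0] * len(candidates)
    values = [0.0] * len(candidates)  # running mean reward per candidate

    for _ in range(steps):
        # Select a candidate: untried first, then explore or exploit.
        untried = [k for k, c in enumerate(counts) if c == 0]
        if untried:
            k = untried[0]
        elif rng.random() < eps:
            k = rng.randrange(len(candidates))
        else:
            k = max(range(len(candidates)), key=values.__getitem__)

        # Finetune the selected candidate on a subset of the target data.
        batch = rng.sample(train, batch_size)
        finetune(candidates[k], subsets[k], batch)

        # Reward is the negative held-out loss (assumption).
        reward = -mse(candidates[k], val)
        counts[k] += 1
        values[k] += (reward - values[k]) / counts[k]

    # Final selection is based on the output of the trained candidates.
    losses = [mse(c, val) for c in candidates]
    best = min(range(len(candidates)), key=losses.__getitem__)
    return best, candidates, losses


# Toy target domain: y = 2*x0 - x2, which differs from the source parameters.
demo_rng = random.Random(42)
target_w = [2.0, 0.0, -1.0]
points = [[demo_rng.uniform(-1, 1) for _ in range(3)] for _ in range(60)]
data = [(x, predict(target_w, x)) for x in points]
best, candidates, losses = transfer_and_select(
    [1.5, 3.0, -0.5], [{0}, {0, 2}, {0, 1, 2}], data[:40], data[40:])
print("selected candidate:", best, "held-out losses:", losses)
```

The selected candidate's parameters (`candidates[best]`) would then be used for the downstream operations. In a real setting the subsets would more plausibly be groups of layers than individual weights, and the bandit could be replaced by any other selection policy.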