CPC G06N 3/08 (2013.01) [G06F 18/24 (2023.01); G06F 18/295 (2023.01); G06N 3/045 (2023.01)] | 20 Claims |
1. A computer-implemented method, comprising:
classifying, by a trained classifier, unlabeled data from a dataset, wherein the classifier includes a neural network;
providing iteratively, by the classifier to a policy gradient function, a reward signal for data/query pairs, wherein the reward signal is a combination of a normalized discounted cumulative gain and a discriminative score output by the classifier;
transferring, by the classifier to a ranker, learning from the classifying, wherein the ranker includes a neural network;
training, by the policy gradient function, the ranker; and
ranking, by the trained ranker, data from the dataset based on a query.
|
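The reward signal recited in claim 1 can be illustrated with a minimal sketch. The claim states only that the reward is "a combination" of a normalized discounted cumulative gain (NDCG) and the classifier's discriminative score; the weighted sum below, the weight `alpha`, and the function names `dcg`, `ndcg`, and `combined_reward` are assumptions made for illustration and are not taken from the patent. The sketch also assumes the linear-gain form of DCG and a discriminative score in the range [0, 1].

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain of relevance labels listed in ranked order.

    Uses the linear-gain form: rel_i / log2(i + 1) for 1-based rank i.
    """
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances: Sequence[float]) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    # With no relevant items the ideal DCG is zero; return 0.0 by convention.
    return dcg(relevances) / ideal if ideal > 0 else 0.0


def combined_reward(
    relevances: Sequence[float],
    discriminative_score: float,
    alpha: float = 0.5,  # hypothetical mixing weight, not specified in the claim
) -> float:
    """Assumed combination: a convex mix of NDCG and the classifier's score."""
    return alpha * ndcg(relevances) + (1.0 - alpha) * discriminative_score


if __name__ == "__main__":
    # Relevance labels of the documents in the order the ranker returned them.
    ranked_relevances = [3, 2, 1]
    print(combined_reward(ranked_relevances, discriminative_score=0.5))
```

In a policy-gradient setup, a scalar such as the one returned by `combined_reward` would typically weight the log-probability of the ranking the ranker produced for each data/query pair. How the claimed method actually forms the combination is not stated in this excerpt.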