CPC G06N 20/00 (2019.01) | 19 Claims |
1. A method for training a classification model in the presence of label noise, the method comprising:
scoring a second dataset of labelled data on a prior generation of a classification model, wherein the prior generation was trained on a first dataset of labelled data;
training a subsequent generation of a classification model with the second dataset of labelled data, wherein the second dataset of labelled data includes label noise whereby at least some of the labelled data is labelled incorrectly;
wherein in training of the subsequent generation, weighting of at least some of the labelled data in the second dataset is adjusted based on the score of such labelled data in the prior generation;
wherein a score for a given item of labelled data in the second dataset indicates a probability that the given item does or does not belong to a first classification, and wherein in assigning a weight, an item with a higher probability of belonging or not belonging to the first classification is assigned a weight lower than an item with a lower probability of belonging or not belonging to the first classification; and
providing the subsequent generation of a classification model as an update for a machine learning system for purposes of classification, correlation or prediction.
|
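The weighting step recited in claim 1 can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the claim: the one-dimensional logistic model, the synthetic datasets, the helper names (`train_logistic`, `score`, `confidence_weights`), the weight formula `1 - |2p - 1|` and the `floor` parameter. The claim only requires that an item scored with a higher probability of belonging or not belonging to the first classification receives a lower weight than an item scored with a lower such probability; the formula here is one of many functions with that property.

```python
import math
import random


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, weights=None, epochs=200, lr=0.1):
    """Fit a 1-D logistic model w*x + b by weighted gradient descent."""
    if weights is None:
        weights = [1.0] * len(xs)
    w, b = 0.0, 0.0
    total = sum(weights)
    for _ in range(epochs):
        gw = gb = 0.0
        for x, y, s in zip(xs, ys, weights):
            err = sigmoid(w * x + b) - y
            gw += s * err * x  # each item's gradient is scaled by its weight
            gb += s * err
        w -= lr * gw / total
        b -= lr * gb / total
    return w, b


def score(model, xs):
    """Probability, under the given model, that each item is in class 1."""
    w, b = model
    return [sigmoid(w * x + b) for x in xs]


def confidence_weights(probs, floor=0.05):
    """Hypothetical weighting: the further p is from 0.5, the lower the weight."""
    return [max(floor, 1.0 - abs(2.0 * p - 1.0)) for p in probs]


def make_dataset(rng, n, noise_rate):
    """Synthetic data: true label is 1 when x > 0, flipped at noise_rate."""
    xs = [rng.uniform(-3.0, 3.0) for _ in range(n)]
    ys = [1 if x > 0 else 0 for x in xs]
    ys = [1 - y if rng.random() < noise_rate else y for y in ys]
    return xs, ys


rng = random.Random(0)
first_xs, first_ys = make_dataset(rng, 200, noise_rate=0.0)
second_xs, second_ys = make_dataset(rng, 200, noise_rate=0.2)  # noisy labels

# Prior generation: trained on the first dataset.
prior = train_logistic(first_xs, first_ys)

# Score the second dataset on the prior generation, then derive weights.
probs = score(prior, second_xs)
weights = confidence_weights(probs)

# Subsequent generation: trained on the second dataset with adjusted weights.
subsequent = train_logistic(second_xs, second_ys, weights)
```

In this sketch the returned `subsequent` pair `(w, b)` stands in for the model that would be provided as the update. The `floor` keeps every item's weight strictly positive so that no labelled item is discarded outright; that is a design choice of the sketch, not a requirement stated in the claim.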