CPC G06N 20/00 (2019.01) | 17 Claims |
1. A method for training classifiers used in machine learning, the method comprising:
receiving, by one or more processors of a computer system, a corpus of training data;
generating, by the one or more processors of the computer system, one or more clusters of the training data according to features of the training data;
comparing, by a cluster refining rule elicitation module of the computer system, neighboring clusters of the one or more clusters to automatically extract boundary rules for suggestion to a user for selection, confirmation and/or editing;
interacting with the user, by the cluster refining rule elicitation module of the computer system, to elicit user-specified rules by suggesting the extracted boundary rules;
automatically refining, by the one or more processors of the computer system, the one or more clusters using the user-specified rules, wherein the refining further includes: assigning subsets of the corpus of training data in or out of clusters of the one or more clusters using the user-specified rules, and using a generative model to resolve conflicts in the assigning; and
training, by one or more processors of a computer system, multiple classifiers for use in machine learning based upon the refined one or more clusters.
|
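The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and every concrete choice in it is an assumption made for illustration: the claim does not name a clustering algorithm, a rule format, a generative model, or a classifier type. Here plain k-means stands in for the clustering step, a boundary rule is a single-feature threshold placed midway between two neighboring centroids, a per-cluster diagonal Gaussian stands in for the generative model that resolves conflicting rule assignments, and one-vs-rest nearest-mean classifiers stand in for the multiple classifiers. All function names (`kmeans`, `extract_boundary_rule`, `refine`, `train_one_vs_rest`, and so on) are hypothetical, and the user interaction is simulated by a hard-coded rule list.

```python
import math
from collections import defaultdict

def centroid(ps):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(x) / len(ps) for x in zip(*ps))

def kmeans(points, centroids, iters=10):
    """Plain k-means from caller-supplied initial centroids (deterministic)."""
    centroids = list(centroids)
    for _ in range(iters):
        labels = [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(len(centroids)):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = centroid(members)
    return labels, centroids

def extract_boundary_rule(centroids, a, b):
    """Suggest a threshold on the feature where clusters a and b differ most."""
    f = max(range(len(centroids[a])), key=lambda i: abs(centroids[a][i] - centroids[b][i]))
    below, above = (a, b) if centroids[a][f] < centroids[b][f] else (b, a)
    return {"feature": f, "threshold": (centroids[a][f] + centroids[b][f]) / 2,
            "below": below, "above": above}

def apply_rule(rule, p):
    """Cluster that a single rule assigns point p to."""
    return rule["below"] if p[rule["feature"]] < rule["threshold"] else rule["above"]

def fit_gaussians(points, labels):
    """Assumed generative model: one diagonal Gaussian per cluster."""
    groups = defaultdict(list)
    for p, l in zip(points, labels):
        groups[l].append(p)
    model = {}
    for l, members in groups.items():
        mean = centroid(members)
        var = [sum((v - m) ** 2 for v in x) / len(members) + 1e-6
               for x, m in zip(zip(*members), mean)]
        model[l] = (mean, var)
    return model

def log_likelihood(p, mean, var):
    """Log density of p under a diagonal Gaussian."""
    return sum(-0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
               for x, m, v in zip(p, mean, var))

def refine(points, labels, user_rules):
    """Reassign points by rule; when rules disagree, ask the generative model."""
    model = fit_gaussians(points, labels)
    refined = []
    for p, l in zip(points, labels):
        votes = {apply_rule(r, p) for r in user_rules}
        if len(votes) == 1:
            refined.append(votes.pop())
        elif votes:  # conflict: keep the candidate cluster with highest likelihood
            refined.append(max(votes, key=lambda c: log_likelihood(p, *model[c])))
        else:        # no rules supplied: keep the original cluster
            refined.append(l)
    return refined

def train_one_vs_rest(points, labels):
    """One nearest-mean classifier per cluster: (mean of members, mean of the rest)."""
    classifiers = {}
    for c in set(labels):
        pos = [p for p, l in zip(points, labels) if l == c]
        neg = [p for p, l in zip(points, labels) if l != c]
        classifiers[c] = (centroid(pos), centroid(neg))
    return classifiers

def predict(classifiers, p):
    """Pick the cluster whose classifier is most confident about p."""
    return max(classifiers,
               key=lambda c: math.dist(p, classifiers[c][1]) - math.dist(p, classifiers[c][0]))

# Toy corpus: two tight groups plus one ambiguous point at the end.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (2.8, 4.0)]
labels, centroids = kmeans(points, [(1.0, 1.0), (5.0, 5.0)])
suggested = extract_boundary_rule(centroids, 0, 1)
# Simulated user interaction: the user confirms the suggestion and adds a rule.
user_rules = [suggested, {"feature": 0, "threshold": 3.0, "below": 0, "above": 1}]
refined = refine(points, labels, user_rules)
classifiers = train_one_vs_rest(points, refined)
print(refined, predict(classifiers, (1.1, 1.0)), predict(classifiers, (5.0, 5.0)))
```

In this toy run the last point, `(2.8, 4.0)`, is where the two rules disagree: the suggested rule places it in cluster 1 and the user-added rule places it in cluster 0, so its assignment falls to the Gaussian likelihood comparison. Every other point receives the same assignment from both rules.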