CPC G06F 9/453 (2018.02) [G06F 17/18 (2013.01); G06F 18/24 (2023.01); G06F 18/41 (2023.01); G06F 40/169 (2020.01); G06N 20/00 (2019.01); G06F 16/285 (2019.01)] | 13 Claims |
1. A method for improving performance of a computer implementing a machine learning system, said method comprising:
providing, via a graphical user interface, to an annotator, unlabeled corpus data;
obtaining, via said graphical user interface, labels for said unlabeled corpus data;
detecting, with a consistency calculation routine, concurrent with said obtaining of said labels, at least internal inconsistency in said labels based on a comparison of an inconsistency measurement in relation to a given threshold, said detecting including periodically retesting said annotator on a portion of data previously-labeled by said annotator, the periodic retesting comprising re-presenting in unlabeled form via said graphical user interface said portion of data that was previously-labeled by said annotator and, in response, receiving a new label from the annotator, the inconsistency measurement being based on a determination of whether said new label is consistent with an initial label provided previously by the annotator to respond to an initial presentation of said portion of data;
responsive to said detection of said internal inconsistency, intervening in said obtaining of said labels, concurrent with said obtaining with said labels, with a reactive intervention subsystem until said internal inconsistency in said labels is addressed;
completing said obtaining of said labels subsequent to said intervening;
carrying out training of said machine learning system to provide a trained machine learning system, based on results of said completing of said obtaining of said labels subsequent to said intervening; and
carrying out classifying new data with said trained machine learning system.
|
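The retest-based consistency check recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The class name `ConsistencyMonitor`, its fields, the disagreement-rate measure, and the default values for `threshold` and `retest_every` are all hypothetical, chosen only to show how an inconsistency measurement might be compared with a given threshold while labels are still being obtained.

```python
from dataclasses import dataclass, field


@dataclass
class ConsistencyMonitor:
    """Tracks one annotator's labels and retest outcomes (illustrative only)."""
    threshold: float = 0.2   # assumed: maximum tolerated disagreement rate
    retest_every: int = 5    # assumed: retest after this many new labels
    labels: dict = field(default_factory=dict)  # item id -> initial label
    retests: int = 0
    disagreements: int = 0
    since_retest: int = 0

    def record(self, item, label):
        """Store the initial label given for a newly presented item."""
        self.labels[item] = label
        self.since_retest += 1

    def due_for_retest(self):
        """True once enough new labels have arrived since the last retest."""
        return self.since_retest >= self.retest_every

    def record_retest(self, item, new_label):
        """Compare a label given on re-presentation with the initial label."""
        self.retests += 1
        self.since_retest = 0
        if new_label != self.labels[item]:
            self.disagreements += 1

    def inconsistency(self):
        """Fraction of retests whose new label differed from the initial one."""
        return self.disagreements / self.retests if self.retests else 0.0

    def needs_intervention(self):
        """True when the inconsistency measurement exceeds the threshold."""
        return self.inconsistency() > self.threshold
```

In this sketch, a labeling loop would call `record` for each new label, re-present a previously labeled item whenever `due_for_retest` returns true, and pause labeling for an intervention while `needs_intervention` is true. Training and classification would proceed only after labeling completes.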