| CPC G06N 20/00 (2019.01) [G06F 18/22 (2023.01); G06F 18/23 (2023.01); G06F 18/2411 (2023.01); G06F 18/2431 (2023.01); G06N 3/084 (2013.01); G06N 3/09 (2023.01)] | 20 Claims |

|
12. A method for training a classifier, comprising:
  receiving embeddings produced from a natural language model;
  receiving raw datasets and seed datasets; and
  repeating one or more iterations of labeling epochs until a first condition is met, the labeling epochs including:
    generating raw vectors corresponding to the raw datasets and seed vectors corresponding to the seed datasets, based on the embeddings;
    assigning pseudo class labels to the raw datasets based on distances between the raw vectors and the seed vectors; and
    repeating one or more iterations of classification epochs until a second condition is met, the classification epochs including:
      updating the embeddings by performing classification tasks using the seed vectors and the raw vectors corresponding to the pseudo class labels which are assumed as ground-truth labels.
|
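
The nested loop structure recited in claim 12 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The claim does not specify how vectors are derived from the embeddings, which distance is used, what classifier is trained, or what the first and second conditions are, so the sketch assumes mean pooling of token embeddings, Euclidean distance to the nearest seed vector, a softmax classifier trained by gradient descent, "pseudo labels stopped changing" as the first condition, and "loss below a tolerance" as the second. All function names, parameters, and toy data are hypothetical.

```python
import numpy as np

def doc_vectors(emb, docs):
    # Assumption: a dataset's vector is the mean of its token embeddings.
    return np.stack([emb[d].mean(axis=0) for d in docs])

def assign_pseudo_labels(raw_vecs, seed_vecs, seed_labels):
    # Assumption: each raw vector takes the label of its nearest seed vector
    # under Euclidean distance.
    dists = np.linalg.norm(raw_vecs[:, None, :] - seed_vecs[None, :, :], axis=2)
    return seed_labels[dists.argmin(axis=1)]

def classification_epoch(emb, W, docs, labels, lr):
    # One gradient step of softmax classification.
    # Updates both the classifier weights W and the embeddings emb in place.
    X = doc_vectors(emb, docs)
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    n = len(docs)
    loss = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    grad_logits = probs.copy()
    grad_logits[np.arange(n), labels] -= 1.0
    grad_logits /= n
    grad_X = grad_logits @ W.T  # computed before W is changed
    W -= lr * (X.T @ grad_logits)
    for d, g in zip(docs, grad_X):
        # Mean pooling spreads each dataset's gradient evenly over its tokens.
        np.subtract.at(emb, np.asarray(d), lr * g / len(d))
    return loss

def train(emb, raw_docs, seed_docs, seed_labels, n_classes,
          max_label_epochs=10, max_clf_epochs=50, loss_tol=0.05, lr=0.5, seed=0):
    rng = np.random.default_rng(seed)
    emb = emb.copy()  # leave the caller's embeddings untouched
    W = rng.normal(scale=0.1, size=(emb.shape[1], n_classes))
    prev = None
    for _ in range(max_label_epochs):  # labeling epochs
        raw_vecs = doc_vectors(emb, raw_docs)
        seed_vecs = doc_vectors(emb, seed_docs)
        pseudo = assign_pseudo_labels(raw_vecs, seed_vecs, seed_labels)
        # Pseudo labels are treated as ground truth for the classification task.
        docs = seed_docs + raw_docs
        labels = np.concatenate([seed_labels, pseudo])
        for _ in range(max_clf_epochs):  # classification epochs
            loss = classification_epoch(emb, W, docs, labels, lr)
            if loss < loss_tol:  # assumed second condition
                break
        if prev is not None and np.array_equal(pseudo, prev):
            break  # assumed first condition: labels are stable
        prev = pseudo
    return emb, W, pseudo

# Toy data: 6 tokens with 4-dimensional embeddings, two seed datasets.
rng = np.random.default_rng(0)
emb0 = rng.normal(size=(6, 4))
seed_docs = [[0, 1], [3, 4]]
seed_labels = np.array([0, 1])
raw_docs = [[0, 1], [3, 4], [0, 1, 2]]
emb_out, W, pseudo = train(emb0, raw_docs, seed_docs, seed_labels, n_classes=2)
print(pseudo)
```

In this sketch the raw and seed vectors are regenerated at the start of every labeling epoch, so embedding updates made during the inner classification loop can change the pseudo labels assigned in the next outer iteration.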