US 11,914,630 B2
Classifier determination through label function creation and unsupervised learning
Yang Wu, San Jose, CA (US); Jiadi Xiong, San Jose, CA (US); Yaqin Yang, Santa Clara, CA (US); and Dinesh Kumar, San Jose, CA (US)
Assigned to PayPal, Inc., San Jose, CA (US)
Filed by PayPal, Inc., San Jose, CA (US)
Filed on Sep. 30, 2021, as Appl. No. 17/491,149.
Prior Publication US 2023/0102892 A1, Mar. 30, 2023
Int. Cl. G06F 16/35 (2019.01); G06F 18/214 (2023.01); G06F 18/2415 (2023.01); G06F 40/30 (2020.01); G06K 9/62 (2022.01); G06N 20/00 (2019.01)
CPC G06F 16/355 (2019.01) [G06F 18/2155 (2023.01); G06F 18/2415 (2023.01); G06F 40/30 (2020.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A method, comprising:
generating, by a computer system, one or more labelling functions for at least one text data category, wherein the one or more labelling functions are generated based on contextual patterns for the at least one text data category, and wherein the contextual patterns are extracted from a lexical database of semantic relations between words;
accessing, by the computer system, a dataset that includes a plurality of unlabeled text data in a freeform format;
determining, by the computer system, a set of probabilistic labels for the at least one text data category by applying the generated labelling functions to the unlabeled text data using an unsupervised machine learning algorithm;
providing the unlabeled text data along with the set of probabilistic labels to a transformer-based machine learning algorithm; and
generating, by the computer system, one or more classifiers for the transformer-based machine learning algorithm by refining one or more predetermined classifiers for the transformer-based machine learning algorithm to classify the unlabeled text based on the set of probabilistic labels, wherein the one or more classifiers are generated to classify text data into the at least one text data category.
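The first three steps of claim 1 (generating labelling functions from a lexical database, applying them to unlabeled freeform text, and deriving probabilistic labels without supervision) can be illustrated with a minimal sketch. It is not the patented implementation. Everything named here is an assumption made for illustration: the `LEXICON` dictionary is a hypothetical stand-in for a lexical database of semantic relations, the categories "refund" and "shipping" are invented, and the label model is a simplified expectation-maximization routine that estimates one accuracy per labelling function, swapped in for whatever unsupervised algorithm the specification actually uses.

```python
import re
from typing import Callable, Dict, List

ABSTAIN = -1  # value a labelling function returns when it has no opinion

# Hypothetical stand-in for a lexical database: category -> related terms.
LEXICON: Dict[str, List[str]] = {
    "refund": ["refund", "reimbursement", "repayment", "chargeback"],
    "shipping": ["shipping", "delivery", "shipment", "courier"],
}
CATEGORIES: List[str] = ["refund", "shipping"]

LabellingFunction = Callable[[str], int]


def make_lf(category_index: int, term: str) -> LabellingFunction:
    """Build a labelling function that votes for a category when `term` appears."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)

    def lf(text: str) -> int:
        return category_index if pattern.search(text) else ABSTAIN

    return lf


def build_labelling_functions(lexicon, categories) -> List[LabellingFunction]:
    """One labelling function per (category, related term) pair."""
    return [make_lf(i, term) for i, cat in enumerate(categories) for term in lexicon[cat]]


def apply_lfs(lfs, texts) -> List[List[int]]:
    """Vote matrix: one row per text, one column per labelling function."""
    return [[lf(text) for lf in lfs] for text in texts]


def posterior(row: List[int], acc: List[float], k: int) -> List[float]:
    """Class probabilities for one text, given per-function accuracies."""
    scores = [1.0] * k
    for j, vote in enumerate(row):
        if vote == ABSTAIN:
            continue
        for y in range(k):
            scores[y] *= acc[j] if vote == y else (1.0 - acc[j]) / (k - 1)
    total = sum(scores)
    return [s / total for s in scores]


def fit_label_model(votes: List[List[int]], k: int, iters: int = 20) -> List[List[float]]:
    """Simplified EM: alternate between label posteriors and function accuracies."""
    acc = [0.7] * len(votes[0])  # assumed starting accuracy for every function
    for _ in range(iters):
        post = [posterior(row, acc, k) for row in votes]  # E-step
        for j in range(len(acc)):  # M-step
            fired = [(row[j], p) for row, p in zip(votes, post) if row[j] != ABSTAIN]
            if fired:
                estimate = sum(p[v] for v, p in fired) / len(fired)
                acc[j] = min(0.95, max(0.05, estimate))  # keep away from 0 and 1
    return [posterior(row, acc, k) for row in votes]


texts = [
    "Please send my reimbursement, the refund never arrived",
    "The courier lost my shipment",
    "I need a refund because delivery was late",
    "Hello, how are you?",
]
lfs = build_labelling_functions(LEXICON, CATEGORIES)
probabilistic_labels = fit_label_model(apply_lfs(lfs, texts), k=len(CATEGORIES))
for text, probs in zip(texts, probabilistic_labels):
    print([round(p, 3) for p in probs], text)
```

In the claimed method, the resulting probabilistic labels would then be supplied together with the unlabeled text to a transformer-based algorithm whose predetermined classifiers are refined. That final step is not shown, since it depends on a model and training setup the claim does not specify.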