| CPC G06Q 10/1053 (2013.01) | 20 Claims |

|
1. A method comprising:
extracting an input text from an online system, the input text comprising a job title;
applying an unsupervised generative text machine learning model to the input text;
generating, by the generative text machine learning model, a plurality of sentences based on the job title;
extracting one or more skills from the plurality of sentences, wherein the extracted one or more skills correspond to one or more skills in a skill taxonomy;
generating a frequency distribution over the extracted one or more skills;
ranking each skill of the extracted one or more skills based on the frequency distribution;
comparing the frequency distribution to a threshold skill distribution;
in response to determining that the frequency distribution does not satisfy the threshold skill distribution, generating additional sentences from the input text;
generating an additional frequency distribution using the plurality of sentences and the additional sentences;
comparing the additional frequency distribution to the threshold skill distribution;
in response to determining that the additional frequency distribution satisfies the threshold skill distribution, ranking each skill of the extracted one or more skills based on the additional frequency distribution;
generating a subset of the extracted one or more skills based on the ranking and a threshold number of skills; and
providing the subset of the extracted one or more skills to a downstream operation, process, or service of the online system.
|
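The steps recited in claim 1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the claim: the taxonomy contents, the substring-based skill extraction, the stub generator standing in for the unsupervised generative text model, and in particular the reading of the "threshold skill distribution" as a minimum total number of skill mentions. The `max_rounds` cap is also an added safeguard that the claim does not recite.

```python
from collections import Counter

# Hypothetical skill taxonomy; the claim only requires that extracted
# skills correspond to entries in some taxonomy.
SKILL_TAXONOMY = {"python", "sql", "statistics", "machine learning", "communication"}


def make_stub_generator(batch_size=2):
    """Stand-in for the generative text model (assumption): returns canned
    sentences in batches and ignores the job title it is given."""
    canned = [
        "A data scientist writes Python and SQL every day.",
        "Strong statistics and Python skills are required.",
        "The role needs machine learning experience and Python.",
        "Communication with stakeholders and SQL reporting are expected.",
    ]
    state = {"next": 0}

    def generate(job_title):
        start = state["next"]
        state["next"] += batch_size
        return [canned[(start + k) % len(canned)] for k in range(batch_size)]

    return generate


def extract_skills(sentences, taxonomy):
    """Naive extraction (assumption): case-insensitive substring match
    of each taxonomy entry against each sentence."""
    found = []
    for sentence in sentences:
        lowered = sentence.lower()
        found.extend(skill for skill in sorted(taxonomy) if skill in lowered)
    return found


def satisfies_threshold(distribution, min_mentions):
    """Assumed reading of the 'threshold skill distribution': the total
    number of skill mentions must reach min_mentions."""
    return sum(distribution.values()) >= min_mentions


def rank_skills(job_title, generate, taxonomy, min_mentions=5, top_k=3, max_rounds=5):
    # Generate sentences from the job title and build the frequency distribution.
    sentences = list(generate(job_title))
    distribution = Counter(extract_skills(sentences, taxonomy))

    # Generate additional sentences until the distribution satisfies the
    # threshold; the new distribution uses the original and added sentences.
    rounds = 0
    while not satisfies_threshold(distribution, min_mentions) and rounds < max_rounds:
        sentences.extend(generate(job_title))
        distribution = Counter(extract_skills(sentences, taxonomy))
        rounds += 1

    # Rank by descending frequency, breaking ties alphabetically, then keep
    # the threshold number of skills for the downstream consumer.
    ranked = sorted(distribution.items(), key=lambda item: (-item[1], item[0]))
    return [skill for skill, _ in ranked[:top_k]]


if __name__ == "__main__":
    print(rank_skills("data scientist", make_stub_generator(), SKILL_TAXONOMY))
```

In this sketch the first batch of two sentences yields only four skill mentions, so one additional batch is generated before ranking. A real system would replace the stub generator and the substring matcher with the model and extractor actually in use.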