CPC G06F 40/247 (2020.01) [G06F 40/40 (2020.01); G06N 5/022 (2013.01)] | 5 Claims |
1. A method for disambiguating a term, comprising:
generating a knowledge set that includes concepts, and training the knowledge set with information sources that contain the concepts using a machine learning technique, to define weights of associations between the concepts;
extracting, from an information source, a listing of concepts and an ambiguous term having two or more potential meanings;
identifying, from the listing of extracted concepts, ones that are associated with each of the potential meanings, as associated concepts, according to the knowledge set that defines associations between concepts;
assessing a likelihood of each potential meaning representing actual meaning of the ambiguous term, based on (a) a strength of the associations between each associated concept and the potential meaning weighted based on a token distance between occurrences of each associated concept and the potential meaning, (b) a frequency of occurrence of each potential meaning in the knowledge set, and (c) directionality of the association between each associated concept and the ambiguous term, wherein the directionality between two concepts indicates whether the occurrence of one concept more likely leads to the occurrence of the other concept than the reverse;
determining one of the potential meanings having the highest likelihood based on the assessment as representing the actual meaning for the ambiguous term, thereby disambiguating the term; and
generating a record, in a non-transitory media, of the actual meaning for the term.
|
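The assessing and determining steps of claim 1 can be illustrated with a minimal sketch. The claim does not specify a formula, so everything concrete here is an assumption made for illustration: the names `Association`, `assess_likelihoods`, and `disambiguate`, the inverse-distance weight `1 / (1 + distance)`, the encoding of directionality as a factor between 0 and 1, the multiplicative combination of the three factors, and the toy data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """Hypothetical knowledge-set entry linking a concept to a meaning."""
    strength: float        # learned association weight (factor a)
    directionality: float  # 0..1; higher when the concept more likely leads
                           # to the meaning than the reverse (factor c)


def assess_likelihoods(meanings, context, term_position, associations, frequency):
    """Return a normalized likelihood for each potential meaning.

    meanings:      potential meanings of the ambiguous term
    context:       (concept, token_position) pairs extracted from the source
    term_position: token position of the ambiguous term
    associations:  {(concept, meaning): Association}
    frequency:     {meaning: occurrence count in the knowledge set} (factor b)
    """
    total_frequency = sum(frequency.get(m, 0) for m in meanings)
    scores = {}
    for meaning in meanings:
        evidence = 0.0
        for concept, position in context:
            assoc = associations.get((concept, meaning))
            if assoc is None:
                continue  # concept is not associated with this meaning
            # Assumed weighting: closer concepts count more.
            distance_weight = 1.0 / (1 + abs(position - term_position))
            evidence += assoc.strength * assoc.directionality * distance_weight
        prior = frequency.get(meaning, 0) / total_frequency if total_frequency else 0.0
        scores[meaning] = prior * evidence
    total = sum(scores.values())
    # Normalize so the scores can be read as relative likelihoods.
    return {m: (s / total if total else 0.0) for m, s in scores.items()}


def disambiguate(term, likelihoods):
    """Pick the highest-likelihood meaning and return a record of it."""
    best = max(likelihoods, key=likelihoods.get)
    return {"term": term, "meaning": best, "likelihood": likelihoods[best]}


# Toy data, invented for this example only.
MEANINGS = ["jaguar/animal", "jaguar/car"]
CONTEXT = [("engine", 3), ("jungle", 10)]  # (concept, token position)
TERM_POSITION = 4
ASSOCIATIONS = {
    ("engine", "jaguar/car"): Association(strength=0.9, directionality=0.8),
    ("jungle", "jaguar/animal"): Association(strength=0.8, directionality=0.7),
}
FREQUENCY = {"jaguar/animal": 60, "jaguar/car": 40}

likelihoods = assess_likelihoods(
    MEANINGS, CONTEXT, TERM_POSITION, ASSOCIATIONS, FREQUENCY
)
record = disambiguate("jaguar", likelihoods)
print(record)
```

With this toy data the car sense is selected even though the animal sense is more frequent in the knowledge set, because "engine" sits one token away from the term while "jungle" is six tokens away. Persisting the returned record to storage, as the final step of the claim recites, is left out of the sketch.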