| CPC G06F 40/268 (2020.01) [G06F 40/30 (2020.01)] | 9 Claims |

|
1. A non-transitory computer-readable recording medium having stored therein an information processing program that causes a computer to execute a process comprising:
classifying a plurality of words contained in predetermined document data under categories defined in a thesaurus;
generating a co-occurrence rate table based on a relationship between polysemous words contained in the predetermined document data and a category of words co-occurring with the polysemous words classified in a certain semantic division;
separating input text into a plurality of words, by performing a morpheme analysis on the input text;
identifying a category to which each of the plurality of words belongs;
identifying, from among the plurality of words contained in the input text, a polysemous word and a semantic division of the polysemous word, based on the category identified and the co-occurrence rate table; and
assigning a vector corresponding to the semantic division of the polysemous word, to the polysemous word contained in the input text.
|
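The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and every name and data value in it (`THESAURUS`, `LABELED_DOCS`, `SENSE_VECTORS`, `tokenize`, `build_cooccurrence_table`, `disambiguate`) is hypothetical. The sketch makes three assumptions: whitespace splitting stands in for the morpheme analysis, the predetermined document data already carries a semantic-division label for each polysemous word, and a semantic division is chosen by summing the co-occurrence rates of the categories found in the input text.

```python
from collections import Counter, defaultdict

# Hypothetical toy thesaurus: word -> category (assumption, not from the claim).
THESAURUS = {
    "river": "nature", "water": "nature", "fish": "nature",
    "money": "finance", "loan": "finance", "deposit": "finance",
}

# Hypothetical vectors, one per (polysemous word, semantic division).
SENSE_VECTORS = {
    ("bank", "shore"): [1.0, 0.0],
    ("bank", "institution"): [0.0, 1.0],
}

# Hypothetical document data; the semantic-division labels are assumed given.
LABELED_DOCS = [
    ("the bank of the river had clear water", {"bank": "shore"}),
    ("fish swim near the river bank", {"bank": "shore"}),
    ("the bank approved the loan", {"bank": "institution"}),
    ("deposit money at the bank", {"bank": "institution"}),
]


def tokenize(text):
    """Stand-in for morpheme analysis: lowercase and split on whitespace."""
    return text.lower().split()


def categorize(words):
    """Pair each word with its thesaurus category (None if unlisted)."""
    return [(w, THESAURUS.get(w)) for w in words]


def build_cooccurrence_table(docs):
    """Map (polysemous word, semantic division) -> {category: rate}."""
    counts = defaultdict(Counter)
    for text, senses in docs:
        cats = [c for _, c in categorize(tokenize(text)) if c is not None]
        for word, sense in senses.items():
            counts[(word, sense)].update(cats)
    table = {}
    for key, counter in counts.items():
        total = sum(counter.values())
        table[key] = {cat: n / total for cat, n in counter.items()}
    return table


def disambiguate(text, table):
    """Return {polysemous word: (semantic division, vector)} for the input."""
    words = tokenize(text)
    cats = [c for _, c in categorize(words) if c is not None]
    polysemous = {w for w, _ in table}
    result = {}
    for w in words:
        if w not in polysemous:
            continue
        # Score each division by the summed rates of the categories present.
        scores = {
            sense: sum(rates.get(c, 0.0) for c in cats)
            for (pw, sense), rates in table.items()
            if pw == w
        }
        best = max(scores, key=scores.get)
        result[w] = (best, SENSE_VECTORS[(w, best)])
    return result


table = build_cooccurrence_table(LABELED_DOCS)
print(disambiguate("she took a loan from the bank", table))
```

In this toy setup the word "bank" receives a different vector depending on whether the surrounding words fall under the "finance" or the "nature" category, which is the effect the final assigning step describes. A production system would replace `tokenize` with a real morphological analyzer and would need a tie-breaking rule when no categorized word co-occurs with the polysemous word.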