CPC G06F 16/2228 (2019.01) [G06F 16/243 (2019.01); G06F 16/2455 (2019.01)] | 12 Claims |
1. A computer-implemented method for interpreting a natural language search query based on a totality of training data comprising a first plurality of terms, the method comprising, using processing circuitry for:
determining a respective frequency with which each of the first plurality of terms occurs in the totality of the training data;
generating a relational data structure associating each term of the first plurality of terms with its respective frequency;
adding a term of the first plurality of terms to a list of relevant terms when its frequency is below a threshold, wherein adding the term of the first plurality of terms to the list of relevant terms when its frequency is below the threshold includes:
determining whether the frequency of the term meets or exceeds a threshold frequency;
in response to determining that the frequency meets or exceeds the threshold frequency, determining that the respective term is not relevant; and
in response to determining that the frequency is below the threshold frequency, adding the term to the list of relevant terms;
receiving the natural language search query;
identifying a second plurality of terms in the natural language search query;
determining, for each term of the second plurality of terms, whether the term is included in the list of relevant terms;
in response to determining that a term of the second plurality of terms is included in the list of relevant terms, identifying the term as a keyword;
interpreting the natural language search query based on the identified keywords; and
performing a search for the natural language search query based on the identified keywords.
|
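The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation; it only shows one way the recited steps could fit together. The function names, the regex tokenizer, the use of raw occurrence counts as the "frequency", the dictionary standing in for the relational data structure, the sample training data, the threshold value, and the toy title-matching search are all assumptions introduced here for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Hypothetical tokenizer: lowercase runs of letters and digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_frequency_table(training_data):
    # Count how often each term occurs across the whole training set and
    # return a term -> frequency mapping (a dict stands in for the
    # "relational data structure"; raw counts are an assumption).
    counts = Counter()
    for text in training_data:
        counts.update(tokenize(text))
    return dict(counts)


def build_relevant_terms(frequency_table, threshold):
    # A term whose frequency meets or exceeds the threshold is treated as
    # not relevant; a term below the threshold is added to the list.
    relevant = set()
    for term, freq in frequency_table.items():
        if freq >= threshold:
            continue
        relevant.add(term)
    return relevant


def extract_keywords(query, relevant_terms):
    # Identify the terms of the query and keep those on the relevant list.
    return [t for t in tokenize(query) if t in relevant_terms]


def search(catalog, keywords):
    # Toy search (assumption): return titles containing any keyword.
    return [
        title for title in catalog
        if any(k in tokenize(title) for k in keywords)
    ]


# Invented sample data, for illustration only.
training_data = [
    "show me a movie with tom cruise",
    "show me a comedy movie",
    "find me a movie about space",
    "show me a documentary about space",
]

table = build_frequency_table(training_data)
relevant = build_relevant_terms(table, threshold=3)
keywords = extract_keywords("show me a space documentary", relevant)
results = search(["Space Documentary", "Comedy Hour"], keywords)

print(keywords)  # ['space', 'documentary']
print(results)   # ['Space Documentary']
```

In this sample, "show", "me", "a", and "movie" each occur at least three times in the training data, so they are excluded from the list of relevant terms, and only "space" and "documentary" survive as keywords for the query.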