CPC G06F 16/3337 (2019.01) [G06F 16/313 (2019.01); G06F 40/284 (2020.01); G06F 40/53 (2020.01); G06N 20/00 (2019.01)] | 20 Claims |
1. A system comprising:
one or more processors; and
one or more machine-readable storage media having instructions stored thereon that, in response to being executed by the one or more processors, cause the system to perform operations comprising:
in response to receiving an input symbol, retrieving an index file from a server, wherein the index file includes a plurality of word tokens;
determining, by a preprocessing module, a frequency of each word token included in the index file;
calculating, by the preprocessing module, respective frequency scores based on the determining the frequency;
identifying, by the preprocessing module, important word tokens based in part on the frequency scores, wherein the identifying the important word tokens includes determining a difference between the frequency scores of each word token and a frequency token threshold; and
determining a match between the received input symbol and the plurality of word tokens, wherein the determining the match includes using an elimination criterion and the important word tokens.
|
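The operations recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and the claim does not specify the details filled in here. The following are assumptions made for illustration only: the index file is modeled as an in-memory list of word tokens rather than being retrieved from a server; the frequency score is the token's relative frequency; a token counts as important when its score minus the threshold is not positive (that is, rarer tokens are kept); and the elimination criterion is a simple prefix test against the input symbol. The names `frequency_scores`, `important_tokens`, `match`, and the default threshold of 0.2 are hypothetical.

```python
from collections import Counter


def frequency_scores(tokens):
    """Relative frequency of each distinct token (an assumed scoring rule)."""
    counts = Counter(tokens)
    total = len(tokens)
    return {tok: n / total for tok, n in counts.items()}


def important_tokens(scores, threshold):
    """Select tokens by the difference between their score and the threshold.

    Assumption: a non-positive difference (a rarer token) marks importance.
    """
    return {tok for tok, score in scores.items() if score - threshold <= 0}


def match(input_symbol, tokens, threshold=0.2):
    """Return distinct matching tokens in first-seen order."""
    scores = frequency_scores(tokens)
    important = important_tokens(scores, threshold)
    seen = set()
    result = []
    for tok in tokens:
        # Hypothetical elimination criterion: discard tokens that do not
        # start with the input symbol, and skip repeats.
        if tok in seen or not tok.startswith(input_symbol):
            continue
        seen.add(tok)
        # Only tokens previously identified as important can match.
        if tok in important:
            result.append(tok)
    return result


# Stand-in for the word tokens of a retrieved index file.
index_tokens = ["the", "the", "the", "tokenizer", "threshold", "match"]
print(match("t", index_tokens))  # ['tokenizer', 'threshold']
```

In this example "the" starts with the input symbol but has a relative frequency of 0.5, above the assumed threshold, so it is not treated as important and is excluded from the match.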