CPC G06F 40/284 (2020.01) [G06F 16/35 (2019.01); G06Q 40/12 (2013.12)] | 18 Claims |
1. A computer-implemented method for categorizing a transaction description, said method comprising:
receiving a description associated with a transaction;
extracting a plurality of candidate keywords from the description, wherein extracting the plurality of candidate keywords from the description comprises:
tokenizing the description into a plurality of tokens;
determining a frequency count of each token; and
if a token is alphanumeric and an associated frequency count is above a count threshold, extracting the token as a candidate keyword;
generating a plurality of n-grams from the plurality of candidate keywords, each n-gram comprising a set of one or more candidate keywords;
calculating a score for each n-gram based on the one or more associated candidate keywords;
determining the n-gram with a highest score; and
generating a modified description with the determined n-gram.
|
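The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details the claim leaves open are filled in by assumption: frequency counts are taken over a reference set of earlier descriptions, tokens are split on whitespace and lower-cased, n-grams are contiguous runs of candidate keywords up to a hypothetical `max_n`, an n-gram's score is the sum of its keywords' counts, and the modified description is simply the winning n-gram joined by spaces. All function and parameter names are hypothetical.

```python
from collections import Counter


def build_counts(descriptions):
    """Count lower-cased, whitespace-split tokens over a reference set (assumed source of counts)."""
    counts = Counter()
    for text in descriptions:
        counts.update(text.lower().split())
    return counts


def extract_candidates(description, counts, count_threshold):
    """Keep tokens that are alphanumeric and whose count is above the threshold."""
    tokens = description.lower().split()
    return [
        tok for tok in tokens
        if tok.isalnum() and counts.get(tok, 0) > count_threshold
    ]


def generate_ngrams(keywords, max_n):
    """Return contiguous n-grams (as tuples) of length 1..max_n over the candidate keywords."""
    ngrams = []
    for n in range(1, max_n + 1):
        for i in range(len(keywords) - n + 1):
            ngrams.append(tuple(keywords[i:i + n]))
    return ngrams


def score_ngram(ngram, counts):
    """Assumed scoring rule: sum of the counts of the n-gram's keywords."""
    return sum(counts.get(k, 0) for k in ngram)


def modified_description(description, counts, count_threshold=1, max_n=2):
    """Run the claimed steps end to end and return the highest-scoring n-gram as text."""
    candidates = extract_candidates(description, counts, count_threshold)
    ngrams = generate_ngrams(candidates, max_n)
    if not ngrams:
        # No candidate keyword survived filtering; fall back to the input (an assumption).
        return description
    best = max(ngrams, key=lambda g: score_ngram(g, counts))
    return " ".join(best)
```

Under these assumptions, a description such as "POS 1234 STARBUCKS COFFEE #881" loses "#881" to the alphanumeric test and "1234" to the count threshold, leaving only the recurring merchant words to form n-grams. Because the assumed score is additive, longer n-grams are never scored lower than their sub-n-grams; a different scoring rule would change which n-gram wins.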