CPC G06F 16/904 (2019.01) [G06F 40/284 (2020.01); G06N 20/00 (2019.01); G06V 30/412 (2022.01)] | 10 Claims |
1. A special purpose computer implemented method of visualizing a machine learning match of a receipt with a line of a document, the method comprising:
tokenizing the line of the document into a document vector of document tokens;
lemmatizing each token in the document vector of the document tokens, and storing the lemmatized document tokens in the document vector of the document tokens;
creating a document term frequency inverse document frequency vector for the document vector of the document tokens;
looping through one or more receipts in a set of receipts, reviewing each receipt, tokenizing a plurality of lines of the receipt into a receipt vector of receipt tokens, wherein the receipt vector includes a location indicator of the location of the receipt token in the receipt;
lemmatizing each receipt token in the receipt vector of the receipt tokens and storing the lemmatized receipt tokens in the receipt vector of the receipt tokens;
creating a receipt term frequency inverse document frequency vector for the receipt vector of the receipt tokens;
comparing the document term frequency inverse document frequency vector to the receipt term frequency inverse document frequency vector using a similarity algorithm to calculate a confidence score and storing the confidence score for each receipt;
determining a matching receipt by selecting the receipt with a highest confidence score;
displaying an indication of the highest confidence score with a variable icon;
displaying the receipt associated with the highest confidence score.
|
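The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it uses only the Python standard library, a smoothed TF-IDF weighting, and cosine similarity as one possible "similarity algorithm". The lemma table `LEMMAS`, the function names, the icon strings, and the score thresholds in `icon_for` are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy lemma table; a real system would use a full lemmatizer.
LEMMAS = {"coffees": "coffee", "rides": "ride", "lunches": "lunch"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lemmatize(tokens):
    """Map each token to its lemma, leaving unknown tokens unchanged."""
    return [LEMMAS.get(t, t) for t in tokens]


def receipt_tokens(lines):
    """Tokenize receipt lines, keeping a (line, position) location per token."""
    out = []
    for line_no, line in enumerate(lines):
        for pos, tok in enumerate(lemmatize(tokenize(line))):
            out.append((tok, (line_no, pos)))
    return out


def tfidf(tokens, idf):
    """Build a sparse TF-IDF vector (dict of term -> weight)."""
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {t: (c / total) * idf.get(t, 0.0) for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def icon_for(score):
    """Pick a 'variable icon' from the score (thresholds are assumptions)."""
    if score >= 0.6:
        return "[###]"
    if score >= 0.3:
        return "[## ]"
    return "[#  ]"


def match(document_line, receipts):
    """Return (best receipt name, its score, its icon, all scores)."""
    doc = lemmatize(tokenize(document_line))
    rec = {name: receipt_tokens(lines) for name, lines in receipts.items()}

    # Smoothed IDF over the document line plus every receipt.
    corpus = [set(doc)] + [{t for t, _ in toks} for toks in rec.values()]
    n = len(corpus)
    vocab = set().union(*corpus)
    idf = {
        t: math.log((1 + n) / (1 + sum(t in d for d in corpus))) + 1.0
        for t in vocab
    }

    doc_vec = tfidf(doc, idf)
    scores = {
        name: cosine(doc_vec, tfidf([t for t, _ in toks], idf))
        for name, toks in rec.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best], icon_for(scores[best]), scores


if __name__ == "__main__":
    receipts = {
        "cafe": ["Blue Cafe", "2 coffees 7.00"],
        "taxi": ["City Taxi", "airport rides 42.00"],
    }
    best, score, icon, _ = match("Coffee at Blue Cafe", receipts)
    print(best, round(score, 3), icon)
```

In this sketch the location indicator is a `(line, position)` pair stored beside each receipt token. It is not used for scoring, but a display layer could use it to highlight the matching tokens on the displayed receipt.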