CPC G06V 30/41 (2022.01) [G06F 40/174 (2020.01); G06N 3/048 (2023.01); G06N 3/08 (2013.01); G06V 30/153 (2022.01)] | 20 Claims |
1. A method comprising:
receiving, by a processing device, one or more documents;
performing optical character recognition on the one or more documents to detect words comprising symbols in the one or more documents;
determining an encoding value for each of the symbols;
applying a first hash function to each encoding value to generate a first set of hashed symbol values;
applying a second hash function to each hashed symbol value of the first set of hashed symbol values to generate a vector array comprising a second set of hashed symbol values;
applying a linear transformation to each value of the second set of hashed symbol values of the vector array;
applying an irreversible non-linear activation function to the vector array to obtain abstract values associated with the symbols; and
saving the abstract values to train a neural network to detect fields in an input document.
|
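The steps recited in claim 1 can be sketched in Python as a minimal illustration. This is not the patented implementation: the claim does not name the encoding, the hash functions, the transformation, or the activation, so every concrete choice here is an assumption. Specifically, the sketch assumes that OCR has already produced a list of words, that the encoding value is the Unicode code point, that the first and second hash functions are SHA-256 and BLAKE2b, that the linear transformation is a fixed random matrix, and that the irreversible non-linear activation is ReLU (negative inputs all map to zero, so the input cannot be recovered). The names `DIM`, `encode`, `first_hash`, `second_hash`, `make_weights`, `linear`, `relu`, `abstract_values`, and `save_abstract_values` are hypothetical.

```python
import hashlib
import json
import random

DIM = 8  # assumed length of the vector array produced per symbol


def encode(symbol: str) -> int:
    """Encoding value for a symbol (assumption: its Unicode code point)."""
    return ord(symbol)


def first_hash(code: int) -> bytes:
    """First hash function (assumption: SHA-256 over the 4-byte code point)."""
    return hashlib.sha256(code.to_bytes(4, "big")).digest()


def second_hash(hashed: bytes) -> list:
    """Second hash function (assumption: BLAKE2b), yielding a vector array.

    Each of the DIM digest bytes is rescaled from 0..255 to the range [-1, 1].
    """
    digest = hashlib.blake2b(hashed, digest_size=DIM).digest()
    return [b / 127.5 - 1.0 for b in digest]


def make_weights(dim: int = DIM, seed: int = 0) -> list:
    """Fixed pseudo-random dim x dim matrix for the linear transformation."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]


def linear(vec: list, weights: list) -> list:
    """Linear transformation: matrix-vector product of weights and vec."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def relu(vec: list) -> list:
    """Irreversible non-linear activation (assumption: ReLU)."""
    return [max(0.0, v) for v in vec]


def abstract_values(words: list, weights: list) -> list:
    """Return one (word, per-symbol abstract vectors) pair for each word."""
    result = []
    for word in words:
        vectors = [
            relu(linear(second_hash(first_hash(encode(ch))), weights))
            for ch in word
        ]
        result.append((word, vectors))
    return result


def save_abstract_values(path: str, values: list) -> None:
    """Save the abstract values as JSON for later use as training data."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(values, fh)


if __name__ == "__main__":
    # Hypothetical OCR output; a real pipeline would obtain it from an OCR engine.
    ocr_words = ["Invoice", "Total"]
    values = abstract_values(ocr_words, make_weights())
    save_abstract_values("abstract_values.json", values)
```

Under these assumptions the mapping is deterministic, so the same symbol always yields the same abstract vector, while the two hashing stages and the ReLU step make it impractical to recover the original symbols from the saved values. Training the neural network on the saved values is outside the scope of this sketch.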