| CPC G06F 16/93 (2019.01) [G06N 3/08 (2013.01); G06V 30/412 (2022.01); G06V 30/416 (2022.01); G06V 30/418 (2022.01)] | 16 Claims |

|
1. A method comprising:
storing, by a processor, cached metadata about a plurality of historical documents in a relational database to increase computation speed;
determining, by the processor, a mean square error (MSE) based on a difference between grayscale pixel intensities of each corresponding pixel of each image in a first content in one or more of a plurality of historical documents and a second content of a new document;
determining, by the processor, match metrics based on a percentage of the second content of the new document that matches the first content in one or more of the plurality of historical documents based on the cached metadata;
extracting, by the processor, the second content from regions of interest in the new document based on the match metrics; and
automatically preparing, by the processor, documents using the second content.
|
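The MSE and match-metric steps of claim 1 can be illustrated with a minimal sketch. The claim does not state how images are represented, how regions are keyed, or what threshold counts as a match, so the names `mean_square_error`, `match_percentage`, the region keys, and the `threshold` parameter are all hypothetical and chosen only for illustration; this is not the patented implementation.

```python
from typing import Mapping, Sequence

# Assumption: a grayscale image is a list of rows of 0-255 intensities.
Gray = Sequence[Sequence[int]]


def mean_square_error(a: Gray, b: Gray) -> float:
    """Mean of squared intensity differences over corresponding pixels."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have identical dimensions")
    pixel_count = sum(len(row) for row in a)
    if pixel_count == 0:
        raise ValueError("images must contain at least one pixel")
    total = sum(
        (pa - pb) ** 2
        for ra, rb in zip(a, b)
        for pa, pb in zip(ra, rb)
    )
    return total / pixel_count


def match_percentage(
    new_regions: Mapping[str, Gray],
    cached_regions: Mapping[str, Gray],
    threshold: float,
) -> float:
    """Percentage of new-document regions whose MSE against the cached
    region with the same key is at or below the (hypothetical) threshold."""
    if not new_regions:
        return 0.0
    matched = sum(
        1
        for key, image in new_regions.items()
        if key in cached_regions
        and mean_square_error(image, cached_regions[key]) <= threshold
    )
    return 100.0 * matched / len(new_regions)


# Usage: one of two regions in the new document matches the cached one.
historical = [[0, 10], [20, 34]]
incoming = [[0, 10], [20, 30]]
score = match_percentage(
    {"header": incoming, "total": incoming},
    {"header": historical},
    threshold=5.0,
)
```

Here a lower MSE means the two images are closer, so a region is counted as matching when its MSE does not exceed the threshold; regions with no cached counterpart are counted as non-matching.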