US 12,259,873 B2
Agnostic image digitizer to detect fraud
James Siekman, Charlotte, NC (US); Aubrey Breon Farrar, Sr., Waldorf, MD (US); Mohamed Faris Khaleeli, Charlotte, NC (US); Patricia Ann Albritton, Charlotte, NC (US); Sheila Page, Charlotte, NC (US); Mark Alan Odiorne, Waxhaw, NC (US); and Marcus R. Matos, Richardson, TX (US)
Assigned to Bank of America Corporation, Charlotte, NC (US)
Filed by Bank of America Corporation, Charlotte, NC (US)
Filed on Nov. 15, 2023, as Appl. No. 18/509,719.
Application 18/509,719 is a continuation of application No. 17/874,714, filed on Jul. 27, 2022, granted, now 11,874,823.
Prior Publication US 2024/0086395 A1, Mar. 14, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/00 (2019.01); G06F 16/23 (2019.01); G06F 21/62 (2013.01)

CPC G06F 16/2365 (2019.01) [G06F 21/6218 (2013.01)]

18 Claims

1. A method for detection of fraudulent submission of a document by using a legacy database of an organization to train a machine learning artificial intelligence (AI) system, the method comprising:

receiving, at a computer hardware processor, legacy documents in a first format from the legacy database of the organization;

digitizing, using optical character recognition (OCR) run on the computer hardware processor, the documents in the first format into a digital format;

wherein:

the first format is incompatible with the computer hardware processor; and

the legacy documents are stored on the legacy database;

converting, using the computer hardware processor, the legacy documents in the digital format into a second format, wherein the second format is compatible with the computer hardware processor;

storing, using the computer hardware processor, the legacy documents in the second format on the legacy database;

training, using a graphics processing unit (GPU) in electronic communication with the computer hardware processor, a machine learning AI system, using the legacy documents stored in the second format in the legacy database, wherein the machine learning AI system is configured to utilize information about an entity submitting a document with patterns learned from the legacy documents to determine a predicted value of a data field in a document;

receiving from an entity, at the computer hardware processor, a first document in the first format that comprises populated data fields;

digitizing, using OCR run on the computer hardware processor, the first document in the first format into a digital format;

converting, using the computer hardware processor, the first document in the digital format into a second format;

ascertaining, using the computer hardware processor, a post-population confidence level for predicting data entries for fields in the first document in the second format using the machine learning AI system;

when the post-population confidence level exceeds a first pre-determined threshold, using the GPU to run the machine learning AI system to predict data entries for the populated data fields in the first document in the second format based on information about the entity submitting the first document and patterns learned from the legacy documents;

determining, using the computer hardware processor, a number of differences between predicted data entries for the populated data fields of the first document as determined by the GPU running the machine learning AI system and the populated data fields in the first document provided by the entity;

when the number of differences found exceeds a second pre-determined threshold, implementing a corrective action, using the computer hardware processor;

wherein:

said corrective action comprises running a first algorithm on the computer hardware processor; and

said first algorithm comprises an algorithm that assesses a potential threat of fraud;

when the first algorithm determines that there is a threat of fraud, providing, using the computer hardware processor, a fraud alert to the organization that includes alerting the organization as to the number of differences found between the predicted data entries for the populated data fields of the first document as determined by the GPU running the machine learning AI system and the populated data fields in the first document provided by the entity.
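
The two-threshold decision flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the OCR, format conversion, and GPU training steps are left out, and the names `screen_document`, `FraudAlert`, `predict`, and `assess_fraud` are hypothetical stand-ins for the machine learning AI system and the "first algorithm" of the claim. The claim does not say what happens when a threshold is not exceeded, so returning `None` in those cases is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class FraudAlert:
    """Hypothetical alert payload: reports the number of differences found."""
    num_differences: int
    differing_fields: List[str]


def find_differences(predicted: Dict[str, str], submitted: Dict[str, str]) -> List[str]:
    """Return names of populated fields whose submitted value differs from the prediction."""
    return [
        name
        for name, value in submitted.items()
        if name in predicted and predicted[name] != value
    ]


def screen_document(
    submitted: Dict[str, str],
    entity_info: Dict[str, str],
    confidence: float,
    predict: Callable[[Dict[str, str]], Dict[str, str]],
    assess_fraud: Callable[[Dict[str, str], List[str]], bool],
    confidence_threshold: float,
    difference_threshold: int,
) -> Optional[FraudAlert]:
    """Apply the claim's two thresholds in order; None means no alert (assumed)."""
    # First threshold: predict only when the post-population confidence exceeds it.
    if confidence <= confidence_threshold:
        return None

    # Stand-in for the trained model: predicts entries from information about the entity.
    predicted = predict(entity_info)

    # Count differences between predicted entries and the entries the entity provided.
    diffs = find_differences(predicted, submitted)

    # Second threshold: corrective action only when the difference count exceeds it.
    if len(diffs) <= difference_threshold:
        return None

    # Corrective action: run the fraud-assessment algorithm (hypothetical callable).
    if not assess_fraud(submitted, diffs):
        return None

    # Fraud alert that includes the number of differences found.
    return FraudAlert(num_differences=len(diffs), differing_fields=diffs)
```

In this sketch both comparisons are strict, matching the claim's wording "exceeds": a confidence equal to the first threshold, or a difference count equal to the second, produces no alert.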