| CPC G06V 30/133 (2022.01) [G06V 30/19147 (2022.01); G06V 30/26 (2022.01)] | 20 Claims |

|
1. A computer-implemented method, comprising:
receiving a document that includes text obtained at least in part through optical character recognition (OCR);
applying an adjusted bidirectional-and-auto-regressive-transformers (BART) model to the text to detect at least one error in a subset of the text, the adjusted BART model having been adjusted from a BART model pretrained to perform a non-optical character recognition (non-OCR) task using a first training dataset comprising corrupted text data and the adjusted BART model further being adjusted from the BART model to perform an OCR task using a second training dataset comprising OCR samples; and
generating, by applying the adjusted BART model to the text of the document, an updated subset of the text correcting the at least one error in the subset of the text.
|
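
As a rough illustration of the detect-and-correct flow recited in claim 1, the Python sketch below treats the adjusted BART model as an injected callable and identifies the changed subset of the text by diffing the model output against the input. The names `Corrector`, `toy_corrector`, and `detect_and_correct`, and the lookup table, are hypothetical and do not come from the claim. The claim does not state that detection is performed by diffing, so that is an assumption of this sketch, and the stand-in corrector is a plain dictionary lookup rather than a trained model.

```python
import difflib
from typing import Callable, List, Tuple

# Hypothetical interface: any callable mapping OCR text to corrected text.
# A real system would wrap a fine-tuned sequence-to-sequence model here.
Corrector = Callable[[str], str]


def toy_corrector(text: str) -> str:
    """Illustrative lookup-based stand-in; this is NOT a BART model."""
    fixes = {"rnodel": "model", "docurnent": "document", "0CR": "OCR"}
    return " ".join(fixes.get(token, token) for token in text.split())


def detect_and_correct(
    text: str, corrector: Corrector
) -> Tuple[str, List[Tuple[str, str]]]:
    """Return the corrected text and the (original, updated) spans that differ."""
    corrected = corrector(text)
    source_tokens = text.split()
    updated_tokens = corrected.split()

    # Align the two token sequences; every non-equal opcode marks a span
    # that the corrector changed, which this sketch treats as a detected error.
    matcher = difflib.SequenceMatcher(
        a=source_tokens, b=updated_tokens, autojunk=False
    )
    changes: List[Tuple[str, str]] = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append(
                (" ".join(source_tokens[i1:i2]), " ".join(updated_tokens[j1:j2]))
            )
    return corrected, changes
```

Calling `detect_and_correct("The rnodel reads the docurnent", toy_corrector)` returns the corrected sentence together with the pairs `("rnodel", "model")` and `("docurnent", "document")`. Passing the model in as a callable keeps the diffing step independent of how the corrector was trained.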