US 12,456,323 B2
Machine-learning models for image processing
Ashutosh K. Sureka, Irving, TX (US); Venkata Sesha Kiran Kumar Adimatyam, Irving, TX (US); Miriam Silver, Tel Aviv (IL); Daniel Funken, Irving, TX (US); Toan Pham, Irving, TX (US); Vicky Kapadia, Irving, TX (US); Sakthivel Palanivel, Irving, TX (US); and Anurag Bhakoo, Irving, TX (US)
Assigned to CITIBANK, N.A., New York, NY (US)
Filed by Citibank, N.A., New York, NY (US)
Filed on May 23, 2025, as Appl. No. 19/217,461.
Application 19/217,461 is a continuation in part of application No. 18/943,672, filed on Nov. 11, 2024, granted, now 12,347,221.
Application 18/943,672 is a continuation in part of application No. 18/629,259, filed on Apr. 8, 2024, granted, now 12,266,145, issued on Apr. 1, 2025.
Prior Publication US 2025/0316109 A1, Oct. 9, 2025
Int. Cl. G06K 9/00 (2022.01); G06V 30/146 (2022.01); G06V 30/413 (2022.01); G06V 30/416 (2022.01); G06V 30/42 (2022.01)

CPC G06V 30/416 (2022.01) [G06V 30/146 (2022.01); G06V 30/413 (2022.01); G06V 30/42 (2022.01)]

20 Claims

1. A method for client-side processing and validation of document imagery, the method comprising:

obtaining, by a computing device associated with an end-user, video data comprising a plurality of frames having image data containing a document from a camera of the computing device using an imaging software program locally executed on the computing device;

obtaining, by the computing device from the imaging software program, first textual content of the document from a first of the plurality of frames and second textual content of the document in a second of the plurality of frames;

identifying, by the computing device, a front of the document based on the first textual content received from the imaging software program and a back of the document based on the second textual content received from the imaging software program;

generating, by the computing device, a first annotation label for first image data of the first frame to indicate an identification of the front of the document in response to identifying the front of the document in the first image data using the first textual content; and

generating, by the computing device, a second annotation label to indicate an identification of the back of the document for second image data of the second frame in response to identifying the back of the document in the second image data using the second textual content.
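The steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not state how textual content is mapped to a side of the document, so this sketch assumes a simple keyword heuristic for a check-like document. The cue lists, the `Frame` and `AnnotationLabel` types, and the functions `identify_side` and `annotate_frames` are hypothetical names introduced only for illustration. The per-frame text is assumed to have already been produced by the locally executed imaging software program.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Hypothetical keyword cues (assumption): the claim does not specify
# how textual content indicates the front or back of the document.
FRONT_CUES = ("pay to the order of", "dollars", "memo")
BACK_CUES = ("endorse here", "do not write", "below this line")


@dataclass
class Frame:
    index: int  # position of the frame in the video data
    text: str   # textual content reported for this frame by the imaging software


@dataclass
class AnnotationLabel:
    frame_index: int
    side: str  # "front" or "back"


def identify_side(text: str) -> Optional[str]:
    """Return "front", "back", or None when the text is inconclusive."""
    lowered = text.lower()
    front_hits = sum(cue in lowered for cue in FRONT_CUES)
    back_hits = sum(cue in lowered for cue in BACK_CUES)
    if front_hits == back_hits:
        return None  # no cues, or a tie: do not label this frame
    return "front" if front_hits > back_hits else "back"


def annotate_frames(frames: Iterable[Frame]) -> Dict[str, AnnotationLabel]:
    """Label the first frame identified as the front and the first as the back."""
    labels: Dict[str, AnnotationLabel] = {}
    for frame in frames:
        side = identify_side(frame.text)
        if side is not None and side not in labels:
            labels[side] = AnnotationLabel(frame_index=frame.index, side=side)
    return labels


if __name__ == "__main__":
    video = [
        Frame(0, "PAY TO THE ORDER OF Jane Doe  One hundred DOLLARS"),
        Frame(1, ""),  # blurred frame, no text recovered
        Frame(2, "ENDORSE HERE  DO NOT WRITE, STAMP OR SIGN BELOW THIS LINE"),
    ]
    for side, label in annotate_frames(video).items():
        print(side, label.frame_index)
```

Everything in this sketch runs on the end-user device, matching the client-side framing of the claim. A production system would likely replace the keyword heuristic with a trained classifier, but the claim language covers only identification based on the textual content, not a particular technique.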