US 11,893,818 B2
Optimization and use of codebooks for document analysis
Ivan Zagaynov, Dolgoprudniy (RU); Vasily Loginov, Moscow (RU); Stanislav Semenov, Moscow (RU); and Aleksandr Valiukov, St. Petersburg (RU)
Assigned to ABBYY Development Inc., Dover, DE (US)
Filed by ABBYY Development Inc., Dover, DE (US)
Filed on Jul. 26, 2021, as Appl. No. 17/384,985.
Claims priority of application No. RU2021121680 (RU), filed on Jul. 21, 2021.
Prior Publication US 2023/0028992 A1, Jan. 26, 2023
Int. Cl. G06V 30/416 (2022.01); G06T 5/30 (2006.01); G06T 5/40 (2006.01); G06F 18/22 (2023.01); G06F 18/21 (2023.01); G06V 30/18 (2022.01); G06V 10/46 (2022.01)
CPC G06V 30/416 (2022.01) [G06F 18/2163 (2023.01); G06F 18/22 (2023.01); G06T 5/30 (2013.01); G06T 5/40 (2013.01); G06V 10/462 (2022.01); G06V 30/18143 (2022.01); G06T 2207/20076 (2013.01); G06T 2207/30176 (2013.01)]
20 Claims
1. A method, comprising:
receiving, by a processing device, a first set of document images;
extracting a plurality of keypoint regions from each document image of the first set of document images;
calculating local descriptors for each keypoint region of the extracted keypoint regions;
clustering the local descriptors such that each center of a cluster of local descriptors corresponds to a respective visual word;
generating a codebook containing a set of visual words;
optimizing the codebook by maximizing mutual information (MI) between a target field of a second set of document images and at least one visual word of the set of visual words; and
detecting, using the codebook, one or more fields in a document image.
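
The clustering and codebook-optimization steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions that the claim does not state: plain Lloyd's k-means stands in for the unspecified clustering step, local descriptors are short tuples of floats, each visual word is reduced to a binary "occurs in this image" indicator, the target field is reduced to a binary label per image of the second set, and "maximizing MI" is approximated by keeping the highest-scoring words. The names kmeans, nearest, mutual_information, and optimize_codebook are hypothetical.

```python
import math
import random
from collections import Counter


def nearest(point, centers):
    """Index of the center closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centers[i])),
    )


def kmeans(points, k, iters=20, seed=0):
    """Lloyd's k-means (assumed clustering choice).

    Each returned center plays the role of one visual word.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # Recompute each center as the mean of its members;
        # an empty cluster keeps its previous center.
        centers = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers


def mutual_information(xs, ys):
    """Mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def optimize_codebook(word_presence, field_labels, keep):
    """Keep the `keep` visual words with the highest MI to the target label.

    word_presence: dict mapping a word id to a 0/1 list, one entry per image.
    field_labels:  0/1 list, one entry per image (assumed field encoding).
    """
    scores = {
        word: mutual_information(presence, field_labels)
        for word, presence in word_presence.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores
```

In this sketch, a word whose presence tracks the target label exactly on a balanced sample scores 1 bit, while a word that is statistically independent of the label scores 0 bits and is dropped first. The final detecting step of the claim is not sketched here.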