US 11,657,631 B2
Scalable, flexible and robust template-based data extraction pipeline
Christos Sagonas, London (GB); Karolina Dabkowska, London (GB); Zhiyuan Shi, London (GB); Edward Fieri Soler, London (GB); Mohan Mahadevan, London (GB); Iona Grace Vincent, London (GB); Luca Peric, London (GB); Alessandro Lenzi, London (GB); Alvaro Fernando Lara, London (GB); and James Stonehill, London (GB)
Assigned to Onfido Ltd., London (GB)
Filed by ONFIDO LTD, London (GB)
Filed on Apr. 28, 2021, as Appl. No. 17/243,467.
Claims priority of application No. 20172169 (EP), filed on Apr. 29, 2020.
Prior Publication US 2021/0343030 A1, Nov. 4, 2021
Int. Cl. G06T 7/30 (2017.01); G06T 7/136 (2017.01); G06T 7/11 (2017.01); G06T 7/70 (2017.01); G06K 9/62 (2022.01); G06T 3/40 (2006.01); G06V 10/22 (2022.01); G06V 30/414 (2022.01); G06V 30/10 (2022.01)
CPC G06T 7/30 (2017.01) [G06K 9/6256 (2013.01); G06K 9/6267 (2013.01); G06T 3/40 (2013.01); G06T 7/11 (2017.01); G06T 7/136 (2017.01); G06T 7/70 (2017.01); G06V 10/22 (2022.01); G06V 30/414 (2022.01); G06T 2207/20132 (2013.01); G06T 2207/30176 (2013.01); G06V 30/10 (2022.01)] 15 Claims
 
1. A computer-implemented method for extracting data from a document comprising:
acquiring an input image comprising a document portion, the document portion being of a document of a first document type;
performing image segmentation on the input image to form a binary input image that distinguishes the document portion from the remaining portion of the input image;
estimating a first image transform to align the binary input image to a binary template image;
using the first image transform on the input image to form an intermediate image;
estimating a second image transform to align the intermediate image to a template image, the template image comprising a template document portion, the template document portion being of a different document of the first document type;
using the second image transform on the intermediate image to form an output image; and
extracting a field from the output image using a predetermined field of the template image.
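
The steps of claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. It assumes same-sized grayscale images stored as nested lists, a fixed-threshold segmentation, and image transforms restricted to integer translations, whereas the claim does not limit the transform family or the segmentation method. All function names (segment, estimate_coarse, estimate_fine, warp, extract_field) and the (top, left, bottom, right) field box layout are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Image = List[List[int]]    # grayscale image as rows of pixel values
Shift = Tuple[int, int]    # (dy, dx) translation; a stand-in for a general image transform


def segment(image: Image, threshold: int = 1) -> Image:
    """Toy segmentation: pixels >= threshold are document (1), the rest background (0)."""
    return [[1 if p >= threshold else 0 for p in row] for row in image]


def bbox_origin(mask: Image) -> Tuple[int, int]:
    """Top-left corner (row, col) of the foreground region of a binary mask."""
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for row in mask for x, v in enumerate(row) if v]
    return min(rows), min(cols)


def warp(image: Image, shift: Shift, fill: int = 0) -> Image:
    """Apply a translation: output[y][x] = image[y - dy][x - dx], fill outside the bounds."""
    dy, dx = shift
    h, w = len(image), len(image[0])
    return [[image[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else fill
             for x in range(w)] for y in range(h)]


def estimate_coarse(binary_input: Image, binary_template: Image) -> Shift:
    """First transform: move the document mask onto the template mask (corner to corner)."""
    iy, ix = bbox_origin(binary_input)
    ty, tx = bbox_origin(binary_template)
    return ty - iy, tx - ix


def estimate_fine(intermediate: Image, template: Image, radius: int = 1) -> Shift:
    """Second transform: small shift minimising the sum of absolute pixel differences."""
    def cost(shift: Shift) -> int:
        moved = warp(intermediate, shift)
        return sum(abs(a - b) for ra, rb in zip(moved, template) for a, b in zip(ra, rb))

    candidates = [(dy, dx) for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)]
    return min(candidates, key=cost)


def extract_field(input_image: Image, template: Image, binary_template: Image,
                  field_box: Tuple[int, int, int, int]) -> Image:
    """Run the claimed sequence: segment, coarse align, fine align, crop the field."""
    binary_input = segment(input_image)                      # binary input image
    first = estimate_coarse(binary_input, binary_template)   # first image transform
    intermediate = warp(input_image, first)                  # intermediate image
    second = estimate_fine(intermediate, template)           # second image transform
    output = warp(intermediate, second)                      # output image
    top, left, bottom, right = field_box                     # predetermined template field
    return [row[left:right] for row in output[top:bottom]]
```

The two-stage structure mirrors the claim: the first estimate uses only the binary masks, so it depends on the document outline and not on its printed content, while the second estimate compares pixel values against a template showing a different document of the same type. A production system would more plausibly estimate perspective or affine transforms at both stages. Translations are used here only to keep the sketch short and dependency-free.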