US 12,292,869 B2
System and method for processing one or more electronic documents for enhanced search
Reghupathi Hariharan, Marlborough, MA (US); Shubham Kackar, Mumbai (IN); Palash Nimodia, Mumbai (IN); Neelesh Kumar Yadav, Mumbai (IN); and V B Krishna Sai Phani Kumar Avanigadda, Mumbai (IN)
Filed by Quantiphi, Inc, Marlborough, MA (US)
Filed on Jun. 29, 2023, as Appl. No. 18/343,923.
Prior Publication US 2025/0005007 A1, Jan. 2, 2025
Int. Cl. G06F 16/22 (2019.01); G06V 30/14 (2022.01)
CPC G06F 16/22 (2019.01) [G06V 30/1448 (2022.01)]
18 Claims
 
1. A method for processing one or more electronic documents for enhanced search, the method comprising:
defining, by a processor, a bounding box around each key and each corresponding value of a plurality of key-value pairs in a first schema file;
tagging, by the processor, a set of four coordinates of a key corresponding to a first bounding box and a set of four coordinates of a value corresponding to a second bounding box in the first schema file;
obtaining, by the processor, a first inference file from a client device;
detecting, by the processor, a set of four coordinates of a key corresponding to a third bounding box from the first inference file based on the set of four coordinates of the key corresponding to the first bounding box in the schema file;
determining a set of four coordinates corresponding to a fourth bounding box encompassing the value of the first inference file, wherein the set of four coordinates corresponding to the fourth bounding box encompassing the value of the first inference file are determined by applying a normalization operation using each of the set of four coordinates corresponding to the first bounding box, the second bounding box, and the third bounding box;
extracting, by the processor, the value encompassed by the fourth bounding box of the first inference file; and
automatically creating, by the processor, a searchable index of the first inference file with searchable key-value pairs based on extraction of at least the value encompassed by the fourth bounding box of the first inference file,
wherein the first schema file is an annotated file in a pre-defined format, in which a text portion of each key and the corresponding value of the plurality of key-value pairs are stored along with the set of four coordinates.
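
The claim does not spell out the normalization operation, so the following is only an illustrative sketch of one plausible reading, not the patented method. It assumes each "set of four coordinates" is an axis-aligned box (x_min, y_min, x_max, y_max), and that normalization means scaling the key-to-value offset from the schema file by the ratio of the key box sizes in the inference file and the schema file. The names Box and locate_value_box are hypothetical.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box; the four-number layout is an assumption."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def width(self) -> float:
        return self.x_max - self.x_min

    @property
    def height(self) -> float:
        return self.y_max - self.y_min


def locate_value_box(schema_key: Box, schema_value: Box, inference_key: Box) -> Box:
    """Estimate the fourth bounding box (value in the inference file).

    Assumed normalization: the value box offsets relative to the key box in
    the schema file are rescaled by how much larger or smaller the key box is
    in the inference file, then re-anchored at the inference key box.
    """
    if schema_key.width == 0 or schema_key.height == 0:
        raise ValueError("schema key box must have non-zero width and height")

    # Scale factors between the schema page and the inference page.
    scale_x = inference_key.width / schema_key.width
    scale_y = inference_key.height / schema_key.height

    return Box(
        inference_key.x_min + (schema_value.x_min - schema_key.x_min) * scale_x,
        inference_key.y_min + (schema_value.y_min - schema_key.y_min) * scale_y,
        inference_key.x_min + (schema_value.x_max - schema_key.x_min) * scale_x,
        inference_key.y_min + (schema_value.y_max - schema_key.y_min) * scale_y,
    )


# First box (schema key), second box (schema value), third box (inference key).
first = Box(10, 10, 50, 20)
second = Box(60, 10, 120, 20)
third = Box(20, 40, 100, 60)

fourth = locate_value_box(first, second, third)
print(fourth)
```

In this example the key box in the inference file is twice as wide and twice as tall as in the schema file, so the estimated value box keeps the same relative position at twice the scale. The text inside the resulting box would then be extracted and indexed as a searchable key-value pair.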