US 11,694,463 B2
Systems and methods for generating document numerical representations
Jerome Gleyzes, Wellington (NZ); Mohamed Khodeir, Wellington (NZ); Salim Fakhouri, Wellington (NZ); Yu Wu, Wellington (NZ); and Soon-Ee Cheah, Wellington (NZ)
Assigned to XERO LIMITED
Filed by Xero Limited
Filed on Jul. 20, 2022, as Appl. No. 17/869,044.
Application 17/869,044 is a continuation of application No. PCT/NZ2021/050133, filed on Aug. 19, 2021.
Claims priority of application No. 2021900419 (AU), filed on Feb. 18, 2021.
Prior Publication US 2022/0358779 A1, Nov. 10, 2022
Int. Cl. G06K 9/46 (2006.01); G06K 9/66 (2006.01); G06V 30/416 (2022.01); G06V 30/19 (2022.01); G06V 30/18 (2022.01); G06V 30/14 (2022.01)
CPC G06V 30/416 (2022.01) [G06V 30/1448 (2022.01); G06V 30/18 (2022.01); G06V 30/19147 (2022.01)]
20 Claims
 
1. A method comprising:
determining a candidate document comprising image data and character data;
extracting the image data and the character data from the candidate document;
providing, to an image-based numerical representation generation model, the image data;
generating, by the image-based numerical representation generation model, an image-based numerical representation of the image data;
providing, to a character-based numerical representation generation model, the character data;
generating, by the character-based numerical representation generation model, a character-based numerical representation of the character data;
providing, to a consolidated image-character based numerical representation generation model, the image-based numerical representation and the character-based numerical representation;
generating, by the consolidated image-character based numerical representation generation model, a combined image-character based numerical representation of the candidate document;
comparing the combined image-character based numerical representation of the candidate document with an index of combined image-character based numerical representations, each combined image-character based numerical representation of the index being indicative of a respective document having a first attribute value;
determining a combined image-character based numerical representation of the index that substantially corresponds with the combined image-character based numerical representation of the candidate document; and
associating the candidate document with the first attribute value of the determined combined image-character based numerical representation of the index;
wherein the respective document of each combined image-character based numerical representation of the index has a second attribute value, and wherein associating the candidate document with the first attribute value of the determined combined image-character based numerical representation of the index further comprises associating the candidate document with the second attribute value of the determined combined image-character based numerical representation of the index.
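
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the three "models" are toy stand-ins (brightness statistics for the image, hashed character counts for the text, concatenation plus L2 normalisation for the consolidated model), "substantially corresponds" is assumed to mean cosine similarity at or above a hypothetical threshold of 0.9, and the attribute names (an issuer as the first attribute, a document type as the second) are illustrative assumptions only.

```python
import math
from dataclasses import dataclass


def embed_image(pixels):
    """Toy stand-in for the image-based model: mean brightness, dark-pixel share."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / (255 * len(flat))
    dark = sum(1 for p in flat if p < 128) / len(flat)
    return [mean, dark]


def embed_text(text, buckets=8):
    """Toy stand-in for the character-based model: hashed character frequencies."""
    chars = [c for c in text.lower() if c.isalnum()]
    vec = [0.0] * buckets
    for c in chars:
        vec[ord(c) % buckets] += 1.0
    n = max(len(chars), 1)
    return [v / n for v in vec]


def combine(image_vec, text_vec):
    """Toy stand-in for the consolidated model: concatenate, then L2-normalise."""
    joined = image_vec + text_vec
    norm = math.sqrt(sum(v * v for v in joined)) or 1.0
    return [v / norm for v in joined]


def embed_document(pixels, text):
    """Run both per-modality models and fuse their outputs."""
    return combine(embed_image(pixels), embed_text(text))


@dataclass
class IndexEntry:
    vector: list            # combined representation of an indexed document
    first_attribute: str    # hypothetical: issuer
    second_attribute: str   # hypothetical: document type


def associate(candidate_vec, index, threshold=0.9):
    """Return both attribute values of the closest index entry, or None.

    Vectors are unit length, so the dot product equals cosine similarity.
    The threshold is an assumed reading of "substantially corresponds".
    """
    if not index:
        return None

    def similarity(entry):
        return sum(a * b for a, b in zip(candidate_vec, entry.vector))

    best = max(index, key=similarity)
    if similarity(best) < threshold:
        return None
    return best.first_attribute, best.second_attribute


# Usage: two indexed documents and one candidate resembling the first.
index = [
    IndexEntry(embed_document([[255, 255], [0, 255]], "Acme Invoice 001"),
               "Acme Ltd", "invoice"),
    IndexEntry(embed_document([[0, 0], [0, 255]], "Globex Receipt 77"),
               "Globex", "receipt"),
]
candidate = embed_document([[255, 255], [0, 255]], "Acme Invoice 002")
print(associate(candidate, index))  # ('Acme Ltd', 'invoice')
```

A linear scan over the index is enough for a sketch like this. A large index would more plausibly use an approximate nearest-neighbour structure, which the claim neither requires nor excludes.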