CPC G06V 30/416 (2022.01) [G06V 30/1448 (2022.01); G06V 30/18 (2022.01); G06V 30/19147 (2022.01)] | 17 Claims |
1. A method comprising:
determining a candidate document comprising image data and character data;
extracting the image data and the character data from the candidate document;
providing, to an image-based numerical representation generation model, the image data;
generating, by the image-based numerical representation generation model, an image-based numerical representation of the image data;
providing, to a character-based numerical representation generation model, the character data;
generating, by the character-based numerical representation generation model, a character-based numerical representation of the character data;
providing, to a consolidated image-character based numerical representation generation model, the image-based numerical representation and the character-based numerical representation;
generating, by the consolidated image-character based numerical representation generation model, a combined image-character based numerical representation of the candidate document;
comparing the combined image-character based numerical representation of the candidate document with an index of combined image-character based numerical representations, each combined image-character based numerical representation of the index being indicative of a respective document having a first attribute value;
determining a combined image-character based numerical representation of the index that substantially corresponds with the combined image-character based numerical representation of the candidate document; and
associating the candidate document with the first attribute value of the determined combined image-character based numerical representation of the index;
wherein the image-based numerical representation generation model, the character-based numerical representation generation model and the consolidated image-character based numerical representation generation model are trained using an objective function configured to maximise a similarity metric between numerical representations of training documents with an identifier common set of attributes.
|
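The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the claimed implementation: the three "models" are hypothetical hand-written stand-ins for trained models, cosine similarity with a fixed threshold is one assumed reading of "substantially corresponds", and every function name, feature, and sample value below is invented for the example. The last function shows one possible quantity that an objective of the kind recited in the final clause might seek to maximise.

```python
import math
from typing import List, Optional, Sequence, Tuple

Vector = List[float]


def l2_normalise(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity metric: cosine of the angle between two vectors."""
    return sum(x * y for x, y in zip(l2_normalise(a), l2_normalise(b)))


def embed_image(pixels: Sequence[float]) -> Vector:
    """Hypothetical stand-in for the image-based model: mean and range of intensity."""
    return [sum(pixels) / len(pixels), max(pixels) - min(pixels)]


def embed_characters(text: str) -> Vector:
    """Hypothetical stand-in for the character-based model: letter and digit fractions."""
    n = max(len(text), 1)
    return [sum(c.isalpha() for c in text) / n, sum(c.isdigit() for c in text) / n]


def combine(image_vec: Sequence[float], char_vec: Sequence[float]) -> Vector:
    """Hypothetical stand-in for the consolidated model: concatenate, then normalise."""
    return l2_normalise(list(image_vec) + list(char_vec))


def embed_document(pixels: Sequence[float], text: str) -> Vector:
    """Image data and character data in, combined representation out."""
    return combine(embed_image(pixels), embed_characters(text))


def classify(candidate: Vector,
             index: List[Tuple[Vector, str]],
             threshold: float = 0.9) -> Optional[str]:
    """Return the attribute value of the most similar index entry,
    or None if no entry reaches the (assumed) similarity threshold."""
    best_vec, best_attr = max(index, key=lambda e: cosine_similarity(candidate, e[0]))
    if cosine_similarity(candidate, best_vec) >= threshold:
        return best_attr
    return None


def shared_attribute_objective(examples: List[Tuple[Vector, str]]) -> float:
    """Mean cosine similarity over pairs of training vectors that share an
    attribute value; training would adjust the models to increase this."""
    sims = [cosine_similarity(a, b)
            for i, (a, attr_a) in enumerate(examples)
            for (b, attr_b) in examples[i + 1:]
            if attr_a == attr_b]
    return sum(sims) / len(sims) if sims else 0.0


# Index of combined representations, each tagged with an attribute value.
index = [
    (embed_document([0.9, 0.8, 1.0, 0.7], "INV 10023 total 450.00"), "invoice"),
    (embed_document([0.2, 0.3, 0.1, 0.2], "Dear customer, thank you"), "letter"),
]

# Candidate document: compare against the index and take the matched attribute.
candidate = embed_document([0.85, 0.8, 0.95, 0.75], "INV 20417 total 99.50")
label = classify(candidate, index)
print(label)
```

In a real system the stand-in functions would be replaced by trained models and the linear scan over the index by a nearest-neighbour search structure; the threshold value is a tuning choice rather than anything stated in the claim.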