| CPC G06V 10/761 (2022.01) [G06N 3/088 (2013.01); G06V 10/771 (2022.01); G06V 10/7715 (2022.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01)] | 20 Claims |

|
1. A method for image processing, comprising:
identifying a plurality of candidate concepts in a knowledge graph (KG) that correspond to an image tag of an image, wherein the knowledge graph comprises a plurality of nodes corresponding to the plurality of candidate concepts;
generating an image embedding of the image using a multi-modal encoder;
generating a text embedding for each of the plurality of candidate concepts using the multi-modal encoder used to generate the image embedding, wherein the image embedding and the text embedding are located in a same embedding space;
selecting a matching concept from the plurality of candidate concepts based on the image embedding and the text embedding;
generating association data between the image and the matching concept; and
transmitting information from the knowledge graph corresponding to the image based on the association data between the image and the matching concept.
|
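The steps of claim 1 can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the patent: the two-node knowledge graph `KG`, the `Concept` record, the `VOCAB` list, and especially `encode_image` and `encode_text`, which are a toy word-count stand-in for a real multi-modal encoder. The "image" is represented by hypothetical visual tokens so that both encoders map into the same small vector space.

```python
import math
from dataclasses import dataclass


@dataclass
class Concept:
    node_id: str
    label: str
    description: str


# Hypothetical knowledge graph: two nodes that share the ambiguous tag "jaguar".
KG = {
    "Q1": Concept("Q1", "jaguar", "large spotted cat living in the jungle"),
    "Q2": Concept("Q2", "jaguar", "british luxury car manufacturer"),
}

# Toy shared embedding space: one dimension per vocabulary word (assumption).
VOCAB = ["cat", "spotted", "jungle", "car", "luxury", "road"]


def _embed(tokens):
    """Count vocabulary words and L2-normalize; stands in for a real encoder."""
    vec = [float(tokens.count(word)) for word in VOCAB]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def encode_text(text):
    return _embed(text.lower().split())


def encode_image(image):
    # A real encoder would consume pixels; here the image carries toy tokens.
    return _embed(image["visual_tokens"])


def candidates_for_tag(kg, tag):
    """Identify the candidate concepts whose label matches the image tag."""
    return [c for c in kg.values() if c.label == tag]


def link_image(kg, image, tag):
    """Select the best-matching concept and return association data."""
    candidates = candidates_for_tag(kg, tag)
    if not candidates:
        return None
    image_vec = encode_image(image)
    scored = []
    for concept in candidates:
        text_vec = encode_text(concept.description)
        # Cosine similarity; both vectors are already unit length.
        score = sum(a * b for a, b in zip(image_vec, text_vec))
        scored.append((score, concept))
    score, best = max(scored, key=lambda pair: pair[0])
    return {"image_id": image["id"], "node_id": best.node_id, "score": score}


def info_for_image(kg, association):
    """Look up the knowledge-graph information linked to the image."""
    return kg[association["node_id"]].description


image = {"id": "img-1", "visual_tokens": ["spotted", "cat", "jungle"]}
association = link_image(KG, image, "jaguar")
print(association["node_id"], info_for_image(KG, association))
```

With this toy data the image tagged "jaguar" is associated with node `Q1` (the animal) rather than `Q2` (the car maker), because its embedding is closer to the text embedding of the first description. Cosine similarity is one plausible selection rule; the claim only requires that the selection be based on the image embedding and the text embeddings.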