CPC G06N 3/08 (2013.01) [G06F 18/2148 (2023.01); G06F 18/22 (2023.01); G06V 10/761 (2022.01); G06V 10/82 (2022.01); G06V 30/1916 (2022.01); G06V 30/19173 (2022.01); G06V 30/274 (2022.01)] | 20 Claims |
1. A method of visual representation learning, comprising:
receiving a set of image embeddings from an image representation model and a set of text embeddings from a text representation model; and
training, using a neural network and employing mutual information, a critic function by learning relationships between the set of image embeddings and the set of text embeddings.
|
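As a rough illustration of the training step recited in claim 1, the sketch below fits a critic function on paired image and text embeddings by maximizing a lower bound on their mutual information. Everything in it is an assumption made for illustration and is not taken from the claim: the InfoNCE bound is used as the mutual-information objective, a single bilinear layer f(x, y) = x^T W y stands in for the neural network, and the names make_toy_pairs, critic_scores, infonce_bound, and train_critic are hypothetical. The embeddings are synthetic placeholders for the outputs of the image and text representation models.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def make_toy_pairs(n=8, dim=3, noise=0.1, seed=0):
    """Hypothetical stand-in data: each text embedding is a noisy copy of its image embedding."""
    rng = random.Random(seed)
    imgs = [normalize([rng.gauss(0, 1) for _ in range(dim)]) for _ in range(n)]
    txts = [normalize([a + noise * rng.gauss(0, 1) for a in x]) for x in imgs]
    return imgs, txts


def critic_scores(W, imgs, txts):
    """Assumed bilinear critic: scores[k][l] = imgs[k]^T W txts[l]."""
    return [[sum(x[i] * W[i][j] * y[j]
                 for i in range(len(x)) for j in range(len(y)))
             for y in txts] for x in imgs]


def log_softmax(row):
    """Numerically stable log-softmax over one row of scores."""
    m = max(row)
    lse = m + math.log(sum(math.exp(v - m) for v in row))
    return [v - lse for v in row]


def infonce_bound(W, imgs, txts):
    """InfoNCE lower bound on mutual information: log N + mean log p(matched pair)."""
    scores = critic_scores(W, imgs, txts)
    n = len(imgs)
    return math.log(n) + sum(log_softmax(scores[k])[k] for k in range(n)) / n


def train_critic(imgs, txts, steps=200, lr=0.5):
    """Gradient ascent on the InfoNCE bound with respect to the critic weights W."""
    n, d_img, d_txt = len(imgs), len(imgs[0]), len(txts[0])
    W = [[0.0] * d_txt for _ in range(d_img)]
    for _ in range(steps):
        scores = critic_scores(W, imgs, txts)
        grad = [[0.0] * d_txt for _ in range(d_img)]
        for k in range(n):
            probs = [math.exp(v) for v in log_softmax(scores[k])]
            for l in range(n):
                # d(log p_kk) / d(scores[k][l]) = [l == k] - p_kl
                coeff = ((1.0 if l == k else 0.0) - probs[l]) / n
                for i in range(d_img):
                    for j in range(d_txt):
                        grad[i][j] += coeff * imgs[k][i] * txts[l][j]
        for i in range(d_img):
            for j in range(d_txt):
                W[i][j] += lr * grad[i][j]
    return W


imgs, txts = make_toy_pairs()
W = train_critic(imgs, txts)
print("InfoNCE bound after training:", infonce_bound(W, imgs, txts))
```

With an all-zero W the bound is 0, and it can never exceed log N for a batch of N pairs, so the printed value indicates how much of that range the critic has captured on this toy batch. A deeper network could replace the bilinear form without changing the objective.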