| CPC G06F 16/78 (2019.01) [G06F 40/30 (2020.01); G06V 10/761 (2022.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01); G06V 20/70 (2022.01); G06V 30/19093 (2022.01)] | 20 Claims |

|
1. A method of training a neural network for finding and retrieving queried videos, comprising:
obtaining two video clips from a first dataset and providing the two video clips to two video encoders for training;
providing an output of each of the two video encoders to a cosine similarity calculator;
training a multi-mentor paradigm having at least two mentors by obtaining two textual inputs from a second dataset, wherein a first mentor is provided each textual input to provide a similarity value comparison and a second mentor is provided said two textual inputs to provide a word mover distance (WMD); and
using said output from said multi-mentor paradigm and said encoders, calculating a contrastive loss used to provide contrastive learning of video features for differentiating similarity and dissimilarity of video clips.
|
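
The loss computation recited in claim 1 can be illustrated with a minimal sketch. The claim does not state how the two mentor outputs are combined or which form the contrastive loss takes, so everything below is an assumption made for illustration: the helper names (`mentor_label`, `contrastive_loss`), the thresholds, the margin, the rule that a pair counts as similar only when both mentors agree, and the margin-based pairwise loss on cosine distance. The mentor outputs (text similarity and WMD) are taken as precomputed numbers rather than computed here.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mentor_label(text_similarity, wmd, sim_threshold=0.5, wmd_threshold=1.0):
    """Hypothetical combination rule (not taken from the claim).

    The pair is labeled similar (1.0) only when the first mentor reports a
    high text similarity AND the second mentor reports a small word mover
    distance; otherwise it is labeled dissimilar (0.0).
    """
    if text_similarity >= sim_threshold and wmd <= wmd_threshold:
        return 1.0
    return 0.0


def contrastive_loss(video_emb_a, video_emb_b, label, margin=0.5):
    """Assumed margin-based contrastive loss on cosine distance.

    Similar pairs are penalized by their squared distance; dissimilar pairs
    are penalized only when they are closer than the margin.
    """
    distance = 1.0 - cosine_similarity(video_emb_a, video_emb_b)
    if label == 1.0:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


# Toy encoder outputs standing in for the two video encoders.
clip_a = [1.0, 0.0]
clip_b = [0.0, 1.0]

# Toy mentor outputs for the two textual inputs paired with the clips.
label = mentor_label(text_similarity=0.9, wmd=0.3)
loss = contrastive_loss(clip_a, clip_b, label)
```

In this toy case the mentors label the pair as similar while the video embeddings are orthogonal, so the loss is large and a training step would pull the two embeddings together. A different combination rule, such as a weighted average of the mentor scores used as a soft target, would fit the claim language equally well.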