| CPC G06V 20/48 (2022.01) [G06V 10/761 (2022.01); G06V 10/774 (2022.01); G06V 10/776 (2022.01); G06V 10/82 (2022.01); G06V 20/41 (2022.01); G06V 20/46 (2022.01); G06V 20/49 (2022.01); G10L 19/008 (2013.01)] | 20 Claims |

|
1. A method for video loop recognition, comprising:
acquiring a first video clip pair from a video, the first video clip pair comprising a first video clip and a second video clip from the video, the first video clip and the second video clip being from different time intervals of the video;
determining a first encoding feature from the first video clip pair, the first encoding feature being associated with first modal information;
determining a second encoding feature from the first video clip pair, the second encoding feature being associated with second modal information that is different from the first modal information, each of the first modal information and the second modal information being selected from video modal information, audio modal information, speech text modal information, video title modal information, and cover modal information;
acquiring a multi-modal neural network model that performs a loop recognition on the video, the multi-modal neural network model comprising a first sequence model associated with the first modal information and a second sequence model associated with the second modal information;
inputting the first encoding feature to the first sequence model that outputs a first similarity result for the first video clip pair;
inputting the second encoding feature to the second sequence model that outputs a second similarity result for the first video clip pair; and
obtaining a loop comparison result of the first video clip pair based on a comparison of the first similarity result with the second similarity result, the loop comparison result indicating a video type of the video.
|
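For illustration only, the data flow recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation. The names `ClipPairFeature`, `sequence_model_stub`, and `loop_comparison_result` are hypothetical. The cosine-similarity threshold is a stand-in for the trained sequence models. The rule that both modalities must agree before the video is labeled a loop is an assumption; the claim only requires that the result be based on a comparison of the two similarity results.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass(frozen=True)
class ClipPairFeature:
    """Hypothetical encoding feature of one clip pair for a single modality."""
    first_clip: Vector
    second_clip: Vector


def sequence_model_stub(feature: ClipPairFeature, threshold: float = 0.9) -> bool:
    """Stand-in for a trained sequence model (assumption, not from the claim).

    Returns True when the two clips are judged similar in this modality.
    """
    return cosine_similarity(feature.first_clip, feature.second_clip) >= threshold


def loop_comparison_result(first_similarity: bool, second_similarity: bool) -> str:
    """Combine the two similarity results into a video type.

    Assumption: the video counts as a loop only if both modalities agree
    that the clips are similar.
    """
    return "loop" if first_similarity and second_similarity else "non-loop"


# First modal information (e.g. video): the two clips look identical.
video_feature = ClipPairFeature([1.0, 0.0, 1.0], [1.0, 0.0, 1.0])
# Second modal information (e.g. audio): the two clips sound different.
audio_feature = ClipPairFeature([1.0, 0.0], [0.0, 1.0])

first_result = sequence_model_stub(video_feature)
second_result = sequence_model_stub(audio_feature)
video_type = loop_comparison_result(first_result, second_result)
```

In this toy input the frames match but the audio does not, so the assumed rule yields `non-loop`. Under that rule, comparing two modalities avoids flagging a video as looped when only the picture repeats.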