CPC G06V 20/46 (2022.01) [G06V 10/80 (2022.01); G06V 10/82 (2022.01)] | 20 Claims |
1. A video content recognition method performed by a computer device, the method comprising:
obtaining an image feature corresponding to a video frame set extracted from a target video, the video frame set comprising at least two video frames;
dividing the image feature into a plurality of image sub-features based on a plurality of channels of the image feature according to a preset sequence, and each image sub-feature comprising a feature of each video frame on a corresponding channel;
choosing, from the plurality of image sub-features based on the preset sequence, a current image sub-feature;
fusing the current image sub-feature and a convolution processing result of a previous image sub-feature into a fused image sub-feature, and performing convolution processing on the fused image sub-feature, to obtain a convolved image sub-feature corresponding to the current image sub-feature;
splicing a plurality of convolved image sub-features corresponding to the plurality of channels of the convolved image sub-feature, to obtain a spliced image feature; and
determining video content corresponding to the target video based on the spliced image feature.
|
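The steps of claim 1 can be illustrated with a minimal NumPy sketch. It is an illustration under stated assumptions, not the claimed implementation. The claim does not fix any of the following, so all of them are assumed here: the function names (`temporal_conv`, `hierarchical_fuse`, `classify`), the `(frames, channels, height, width)` layout, element-wise addition as the fusing operation, a 1D kernel applied along the frame axis as the convolution, direct convolution of the first sub-feature (which has no predecessor), and average pooling followed by linear scoring as the final determination.

```python
import numpy as np

def temporal_conv(x, kernel):
    """Slide a 1D kernel along the frame axis (axis 0), zero-padded so the
    number of frames is unchanged. Assumes an odd-length kernel."""
    t, k = x.shape[0], len(kernel)
    pad = k // 2
    padded = np.pad(x, [(pad, pad)] + [(0, 0)] * (x.ndim - 1))
    out = np.zeros_like(x, dtype=float)
    for i, w in enumerate(kernel):
        out += w * padded[i:i + t]
    return out

def hierarchical_fuse(feature, num_groups, kernel):
    """feature: array of shape (T, C, H, W); C must be divisible by num_groups."""
    # Divide by channel; the "preset sequence" is assumed to be channel order.
    groups = np.split(feature, num_groups, axis=1)
    convolved, prev = [], None
    for current in groups:
        # Fuse the current sub-feature with the previous convolution result
        # (assumed: element-wise addition; the first group has no predecessor).
        fused = current if prev is None else current + prev
        prev = temporal_conv(fused, kernel)
        convolved.append(prev)
    # Splice the convolved sub-features back together along the channel axis.
    return np.concatenate(convolved, axis=1)

def classify(spliced, weights):
    """Assumed head: average over frames and space, then linear scores."""
    pooled = spliced.mean(axis=(0, 2, 3))   # shape (C,)
    return int(np.argmax(weights @ pooled))

# Usage with random data: 4 frames, 8 channels, 2x2 spatial map, 3 classes.
rng = np.random.default_rng(0)
feature = rng.standard_normal((4, 8, 2, 2))
spliced = hierarchical_fuse(feature, num_groups=4, kernel=[0.25, 0.5, 0.25])
label = classify(spliced, rng.standard_normal((3, 8)))
```

Because each sub-feature is fused with the already convolved output of its predecessor, later channel groups pass through more convolutions and so cover a longer span of frames than earlier ones, while the spliced feature keeps the original shape.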