US 11,983,926 B2
Video content recognition method and apparatus, storage medium, and computer device
Yan Li, Shenzhen (CN); Bin Ji, Shenzhen (CN); Xintian Shi, Shenzhen (CN); and Bin Kang, Shenzhen (CN)
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed by Tencent Technology (Shenzhen) Company Limited, Shenzhen (CN)
Filed on Feb. 17, 2022, as Appl. No. 17/674,688.
Application 17/674,688 is a continuation of application No. PCT/CN2020/122152, filed on Oct. 20, 2020.
Claims priority of application No. 202010016375.2 (CN), filed on Jan. 8, 2020.
Prior Publication US 2022/0172477 A1, Jun. 2, 2022
Int. Cl. G06V 20/40 (2022.01); G06V 10/80 (2022.01); G06V 10/82 (2022.01)
CPC G06V 20/46 (2022.01) [G06V 10/80 (2022.01); G06V 10/82 (2022.01)]
20 Claims
 
1. A video content recognition method performed by a computer device, the method comprising:
obtaining an image feature corresponding to a video frame set extracted from a target video, the video frame set comprising at least two video frames;
dividing the image feature into a plurality of image sub-features based on a plurality of channels of the image feature according to a preset sequence, and each image sub-feature comprising a feature of each video frame on a corresponding channel;
choosing, from the plurality of image sub-features based on the preset sequence, a current image sub-feature;
fusing the current image sub-feature and a convolution processing result of a previous image sub-feature into a fused image sub-feature, and performing convolution processing on the fused image sub-feature, to obtain a convolved image sub-feature corresponding to the current image sub-feature;
splicing a plurality of convolved image sub-features corresponding to the plurality of channels of the convolved image sub-feature, to obtain a spliced image feature; and
determining video content corresponding to the target video based on the spliced image feature.
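
The steps recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation, and several details are assumptions made only for illustration: the image feature is taken to have shape (frames, channels, height, width); the "preset sequence" is taken to be plain channel order; fusion is element-wise addition; the convolution is a one-dimensional convolution along the frame axis with a caller-supplied kernel; the first sub-feature, which has no predecessor, is convolved without fusion; and the final determination is average pooling followed by a linear scoring step. The names temporal_conv, recognize, num_groups, kernel, and class_weights are hypothetical.

```python
import numpy as np


def temporal_conv(x, kernel):
    """Convolve x along the frame axis (axis 0) with zero padding.

    Assumes an odd-length kernel so the output keeps the input's frame count.
    """
    pad = len(kernel) // 2
    padded = np.pad(x, [(pad, pad)] + [(0, 0)] * (x.ndim - 1))
    out = np.zeros_like(x, dtype=float)
    for i, weight in enumerate(kernel):
        out += weight * padded[i:i + x.shape[0]]
    return out


def recognize(feature, num_groups, kernel, class_weights):
    """Sketch of claim 1 on a feature of shape (frames, channels, H, W)."""
    # Divide the image feature into sub-features along the channel axis.
    # Each sub-feature keeps every frame's values on its own channels.
    sub_features = np.split(feature, num_groups, axis=1)

    convolved = []
    previous = None  # convolution result of the previous sub-feature
    for current in sub_features:  # iterate in the (assumed) preset sequence
        # Fuse with the previous convolution result (assumed: addition).
        fused = current if previous is None else current + previous
        previous = temporal_conv(fused, kernel)
        convolved.append(previous)

    # Splice the convolved sub-features back together along the channels.
    spliced = np.concatenate(convolved, axis=1)

    # Assumed classification head: average over frames and space, then
    # score each class with a linear map and take the best one.
    pooled = spliced.mean(axis=(0, 2, 3))
    scores = class_weights @ pooled
    return spliced, int(np.argmax(scores))
```

Because each sub-feature is fused with the already convolved output of its predecessor, later channel groups pass through more temporal convolutions than earlier ones, so the spliced feature combines information from several different temporal ranges.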