| CPC G06V 40/20 (2022.01) [G06V 10/56 (2022.01); G06V 10/757 (2022.01); G06V 10/761 (2022.01); G06V 10/806 (2022.01); G06V 20/46 (2022.01); G06V 20/48 (2022.01)] | 19 Claims |

|
1. An action recognition method performed by a computer device, comprising:
obtaining multiple video frames in a target video;
performing feature extraction on the multiple video frames respectively according to multiple dimensions to obtain multiple multi-channel feature patterns, each video frame corresponding to one multi-channel feature pattern, and each channel representing one dimension;
determining an attention weight of each multi-channel feature pattern based on a similarity between every two multi-channel feature patterns in the multiple multi-channel feature patterns comprising the each multi-channel feature pattern and another multi-channel feature pattern, the attention weight being used for representing a degree of correlation between a corresponding multi-channel feature pattern and an action performed by an object in the target video, the similarity between a multi-channel feature pattern pair being used for representing a magnitude of a motion performed by the object in the multiple video frames corresponding to the multi-channel feature pattern pair; and
determining a type of the action based on the multiple multi-channel feature patterns and the determined multiple attention weights.
|
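For illustration only, the steps recited in claim 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the claimed implementation. The claim does not specify a similarity measure, a weighting rule, or a classifier, so the sketch assumes cosine similarity, assumes that lower similarity to the other frames indicates larger motion and therefore a higher attention weight, normalizes the weights with a softmax, and classifies by nearest prototype. The names `cosine_similarity`, `attention_weights`, `classify`, and `prototypes` are hypothetical, and each multi-channel feature pattern is reduced to one scalar per channel for brevity.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two feature patterns (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def attention_weights(features):
    """One weight per frame, derived from pairwise similarities.

    Assumption: a frame that is less similar to the other frames reflects
    more motion, so it receives a larger weight.
    """
    n = len(features)
    if n == 1:
        return [1.0]
    scores = []
    for i in range(n):
        sims = [cosine_similarity(features[i], features[j])
                for j in range(n) if j != i]
        scores.append(1.0 - sum(sims) / len(sims))
    # Softmax normalization so the weights sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, prototypes):
    """Pool the weighted features and pick the most similar prototype.

    `prototypes` maps a hypothetical action label to a reference pattern.
    """
    channels = len(features[0])
    pooled = [sum(w * f[c] for w, f in zip(weights, features))
              for c in range(channels)]
    return max(prototypes,
               key=lambda label: cosine_similarity(pooled, prototypes[label]))


# Three frames, two channels each (toy values, not real extracted features).
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights = attention_weights(features)
prototypes = {"stand": [1.0, 0.0], "wave": [0.0, 1.0]}
label = classify(features, weights, prototypes)
```

In this toy input the third frame differs most from the other two, so it receives the largest weight. A practical system would replace the scalar channels with feature maps produced by a trained network and the prototype lookup with a trained classifier.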