CPC G06V 40/20 (2022.01) [G06F 18/213 (2023.01); G06T 7/246 (2017.01); G06N 3/02 (2013.01); G06T 2207/20081 (2013.01)] | 19 Claims |
1. An action recognition method, performed by a computer device, the method comprising:
obtaining image data of video data in a plurality of different temporal frames;
obtaining original feature submaps of each of the temporal frames on a plurality of different convolutional channels by using a multi-channel convolutional layer;
calculating, by using each of the temporal frames as a target temporal frame, motion information weights of the target temporal frame on the convolutional channels according to the original feature submaps of the target temporal frame on the convolutional channels and the original feature submaps of a next temporal frame adjacent to the target temporal frame on each of the convolutional channels;
obtaining motion information feature maps of the target temporal frame on the convolutional channels according to the motion information weights and the original feature submaps of the target temporal frame on the convolutional channels;
performing temporal convolution on the motion information feature maps to obtain temporal motion feature maps of the target temporal frame on the convolutional channels; and
recognizing an action type of a moving object in image data of the target temporal frame according to the temporal motion feature maps.
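The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not fix how the weights are computed, how they are combined with the submaps, or how the classifier works. The sketch assumes a sigmoid of the spatially pooled difference between adjacent frames as the motion information weight, element-wise scaling to form the motion information feature maps, a fixed three-tap temporal kernel, and a linear classifier. All function names (`motion_weights`, `apply_weights`, `temporal_conv`, `recognize`) and the `features[t][c]` layout are hypothetical. The original feature submaps are taken as given, so the multi-channel convolutional layer itself is not shown.

```python
import math


def spatial_mean(fmap):
    """Average a 2-D feature submap (a list of rows) to one scalar."""
    values = [v for row in fmap for v in row]
    return sum(values) / len(values)


def motion_weights(features):
    """Return weights[t][c] for each target frame t and channel c.

    features[t][c] is the H x W original feature submap of frame t on
    channel c. Assumption: the weight is a sigmoid of the pooled difference
    between the next adjacent frame and the target frame. The last frame has
    no next frame, so it is compared with itself (weight 0.5).
    """
    num_frames = len(features)
    weights = []
    for t in range(num_frames):
        nxt = features[min(t + 1, num_frames - 1)]
        weights.append([
            1.0 / (1.0 + math.exp(-(spatial_mean(n) - spatial_mean(c))))
            for c, n in zip(features[t], nxt)
        ])
    return weights


def apply_weights(features, weights):
    """Scale each original submap by its channel weight (assumed combination)."""
    return [
        [[[w * v for v in row] for row in fmap]
         for fmap, w in zip(frame, frame_w)]
        for frame, frame_w in zip(features, weights)
    ]


def temporal_conv(maps, kernel=(0.25, 0.5, 0.25)):
    """Per-channel 1-D convolution along the time axis with zero padding."""
    num_frames = len(maps)
    half = len(kernel) // 2
    out = []
    for t in range(num_frames):
        frame = []
        for c in range(len(maps[t])):
            height, width = len(maps[t][c]), len(maps[t][c][0])
            acc = [[0.0] * width for _ in range(height)]
            for k, coef in enumerate(kernel):
                s = t + k - half  # source frame index for this kernel tap
                if 0 <= s < num_frames:
                    for i in range(height):
                        for j in range(width):
                            acc[i][j] += coef * maps[s][c][i][j]
            frame.append(acc)
        out.append(frame)
    return out


def recognize(temporal_maps_t, class_weights, labels):
    """Pool each channel and pick the best-scoring label (assumed linear classifier)."""
    pooled = [spatial_mean(m) for m in temporal_maps_t]
    scores = [sum(w * p for w, p in zip(row, pooled)) for row in class_weights]
    return labels[scores.index(max(scores))]


# Toy input: 2 temporal frames, 1 convolutional channel, 1 x 1 submaps.
features = [[[[0.0]]], [[[2.0]]]]
weights = motion_weights(features)
motion_maps = apply_weights(features, weights)
temporal_maps = temporal_conv(motion_maps)
action = recognize(temporal_maps[0], [[1.0], [-1.0]], ["moving", "still"])
```

In this toy input the channel activation rises from the first frame to the second, so the first frame receives a weight above 0.5, while the last frame, which has no successor, falls back to 0.5. A learned model would replace the fixed kernel and the hand-set classifier weights with trained parameters.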