| CPC G06V 20/44 (2022.01) [G06V 10/764 (2022.01); G06V 20/41 (2022.01); G06V 20/70 (2022.01)] | 20 Claims |

|
1. A method comprising:
identifying, by a device, a video depicting a key event, the video having a plurality of frames;
extracting, by the device, a sequence of frames from the plurality of frames;
determining, by the device, a camera view for each frame of the sequence of frames to form a sequence of camera views by:
applying a pre-trained panoptic segmentation model to identify relevant objects within the frame;
applying a pre-trained pose estimation model to estimate poses for any persons within the frame; and
assigning a label to the frame based on a predetermined labeling function and semantic information yielded by the panoptic segmentation model and the pose estimation model; and
determining, by the device, a type of key event from the sequence of camera views by comparing the sequence of camera views to predetermined arrangements of camera views associated with different types of key events.
|
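The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. The `FrameSemantics` container, the view labels, the rules inside `label_camera_view`, and the `EVENT_PATTERNS` table are all hypothetical and invented for illustration. The outputs of the pre-trained panoptic segmentation and pose estimation models are assumed to be already available as plain values, and "comparing" is assumed here to mean finding a predetermined arrangement as a contiguous run in the de-duplicated sequence of camera views.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class FrameSemantics:
    """Hypothetical stand-in for the per-frame output of the two models."""
    object_classes: frozenset  # classes reported by panoptic segmentation
    person_heights: tuple      # pose-derived person heights, as a fraction of frame height


def label_camera_view(sem: FrameSemantics) -> str:
    """Hypothetical labeling function; the rules and thresholds are illustrative."""
    tallest = max(sem.person_heights, default=0.0)
    if tallest >= 0.6:
        return "close_up"
    if "goal_post" in sem.object_classes:
        return "goal_view"
    if "crowd" in sem.object_classes and "field" not in sem.object_classes:
        return "crowd"
    return "wide"


# Hypothetical predetermined arrangements of camera views per key-event type.
EVENT_PATTERNS = {
    "goal": ("wide", "goal_view", "close_up", "crowd"),
    "foul": ("wide", "close_up", "wide"),
}


def collapse_runs(views: Iterable[str]) -> Tuple[str, ...]:
    """Merge consecutive identical labels so each shot appears once."""
    return tuple(label for label, _ in groupby(views))


def contains_run(seq: Tuple[str, ...], pattern: Tuple[str, ...]) -> bool:
    """True if pattern occurs as a contiguous run inside seq."""
    n = len(pattern)
    return any(seq[i:i + n] == pattern for i in range(len(seq) - n + 1))


def classify_key_event(frames: Iterable[FrameSemantics]) -> Optional[str]:
    """Label each frame, then compare the view sequence to the known patterns."""
    views = collapse_runs(label_camera_view(f) for f in frames)
    # Try longer patterns first so the more specific arrangement wins.
    for event, pattern in sorted(EVENT_PATTERNS.items(), key=lambda kv: -len(kv[1])):
        if contains_run(views, pattern):
            return event
    return None


if __name__ == "__main__":
    wide = FrameSemantics(frozenset({"field"}), (0.1,))
    goal_view = FrameSemantics(frozenset({"field", "goal_post"}), (0.2,))
    close_up = FrameSemantics(frozenset(), (0.8,))
    crowd = FrameSemantics(frozenset({"crowd"}), ())
    print(classify_key_event([wide, wide, goal_view, close_up, crowd]))
```

Collapsing repeated labels before matching is a design choice of this sketch: it makes the comparison depend on the order of shots rather than on how many frames were sampled from each shot.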