CPC H04N 21/23418 (2013.01) [G06T 7/246 (2017.01); G06V 20/46 (2022.01); G06T 2207/10021 (2013.01)] | 20 Claims |
1. A system comprising:
a processor; and
a computer-readable medium storing instructions that are operative upon execution by the processor to:
receive a video stream comprising a plurality of video frames;
group the plurality of video frames into a set of present video frames and a set of historical video frames, the set of present video frames comprising a current video frame;
determine a set of attention weights for the set of historical video frames, the set of attention weights indicating how informative a video frame is for predicting action in the current video frame;
weight the set of historical video frames with the set of attention weights to produce a set of weighted historical video frames; and
based on at least the set of weighted historical video frames and the set of present video frames, generate an action prediction for the current video frame.
|
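The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the claim does not say how the attention weights are computed or how the prediction is generated. The sketch assumes each frame is already a feature vector, scores each historical frame by scaled dot-product similarity to the current frame followed by a softmax, and predicts with a linear classifier. All names (`group_frames`, `attention_weights`, `weight_frames`, `predict_action`, `num_present`, `class_weights`) are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def group_frames(frames: Sequence[Vector], num_present: int) -> Tuple[List[Vector], List[Vector]]:
    """Split frames into (historical, present); the last present frame is the current frame."""
    if not 0 < num_present < len(frames):
        raise ValueError("need at least one present and one historical frame")
    split = len(frames) - num_present
    return list(frames[:split]), list(frames[split:])


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(historical: Sequence[Vector], current: Vector) -> List[float]:
    """Assumed scoring rule: scaled dot product with the current frame, then softmax."""
    scale = math.sqrt(len(current))
    scores = [sum(h * c for h, c in zip(frame, current)) / scale for frame in historical]
    return softmax(scores)


def weight_frames(historical: Sequence[Vector], weights: Sequence[float]) -> List[Vector]:
    """Multiply each historical frame's features by its attention weight."""
    return [[w * x for x in frame] for frame, w in zip(historical, weights)]


def predict_action(weighted_hist: Sequence[Vector], present: Sequence[Vector],
                   class_weights: Sequence[Vector]) -> int:
    """Assumed predictor: linear classifier over [summed history, mean of present frames]."""
    dim = len(present[0])
    context = [sum(frame[i] for frame in weighted_hist) for i in range(dim)]
    present_mean = [sum(frame[i] for frame in present) / len(present) for i in range(dim)]
    features = context + present_mean
    logits = [sum(w * x for w, x in zip(row, features)) for row in class_weights]
    # Return the index of the highest-scoring action class.
    return max(range(len(logits)), key=logits.__getitem__)
```

In this sketch a caller would run `group_frames`, then `attention_weights` against the last present frame, then `weight_frames`, and finally `predict_action`. A trained model would learn the scoring and classification parameters rather than take them as fixed inputs.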