US 11,895,343 B2
Video frame action detection using gated history
Gaurav Mittal, Redmond, WA (US); Ye Yu, Redmond, WA (US); Mei Chen, Bellevue, WA (US); and Junwen Chen, Rochester, NY (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on Jun. 28, 2022, as Appl. No. 17/852,310.
Claims priority of provisional application 63/348,993, filed on Jun. 3, 2022.
Prior Publication US 2023/0396817 A1, Dec. 7, 2023
Int. Cl. H04N 21/23 (2011.01); H04N 21/234 (2011.01); G06V 20/40 (2022.01); G06T 7/246 (2017.01)
CPC H04N 21/23418 (2013.01) [G06T 7/246 (2017.01); G06V 20/46 (2022.01); G06T 2207/10021 (2013.01)]
20 Claims
 
1. A system comprising:
a processor; and
a computer-readable medium storing instructions that are operative upon execution by the processor to:
receive a video stream comprising a plurality of video frames;
group the plurality of video frames into a set of present video frames and a set of historical video frames, the set of present video frames comprising a current video frame;
determine a set of attention weights for the set of historical video frames, the set of attention weights indicating how informative a video frame is for predicting action in the current video frame;
weight the set of historical video frames with the set of attention weights to produce a set of weighted historical video frames; and
based on at least the set of weighted historical video frames and the set of present video frames, generate an action prediction for the current video frame.
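
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes each video frame has already been reduced to a small feature vector, and every name in it (`group_frames`, `attention_weights`, `weight_frames`, `predict_action`, `class_vectors`) is hypothetical. The claim does not specify how attention weights are computed; the sigmoid of a scaled dot product with the current frame is an assumption chosen for illustration, as is the mean-pool-and-softmax prediction step.

```python
import math
from typing import List, Sequence, Tuple

Frame = List[float]  # assumption: one feature vector per video frame


def group_frames(frames: Sequence[Frame], num_present: int) -> Tuple[List[Frame], List[Frame]]:
    """Split the stream into historical frames and present frames.

    The last present frame is treated as the current video frame.
    """
    if not 0 < num_present <= len(frames):
        raise ValueError("num_present must be between 1 and len(frames)")
    split = len(frames) - num_present
    return list(frames[:split]), list(frames[split:])


def attention_weights(historical: Sequence[Frame], current: Frame) -> List[float]:
    """One weight in (0, 1) per historical frame.

    Assumed scoring: sigmoid of the scaled dot product with the current frame.
    """
    scale = math.sqrt(len(current))
    weights = []
    for frame in historical:
        score = sum(h * c for h, c in zip(frame, current)) / scale
        weights.append(1.0 / (1.0 + math.exp(-score)))
    return weights


def weight_frames(historical: Sequence[Frame], weights: Sequence[float]) -> List[Frame]:
    """Scale each historical frame by its attention weight."""
    return [[w * x for x in frame] for frame, w in zip(historical, weights)]


def mean_pool(frames: Sequence[Frame], dim: int) -> Frame:
    """Average a set of frames; an empty set pools to the zero vector."""
    if not frames:
        return [0.0] * dim
    return [sum(f[i] for f in frames) / len(frames) for i in range(dim)]


def predict_action(weighted_hist: Sequence[Frame], present: Sequence[Frame],
                   class_vectors: Sequence[Frame]) -> List[float]:
    """Return class probabilities for the current frame.

    Assumed head: add the pooled history and pooled present features,
    score against one vector per action class, then apply a softmax.
    """
    dim = len(present[-1])
    pooled = [a + b for a, b in zip(mean_pool(weighted_hist, dim), mean_pool(present, dim))]
    logits = [sum(p * c for p, c in zip(pooled, vec)) for vec in class_vectors]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Usage with toy two-dimensional features and two hypothetical action classes.
stream = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
historical, present = group_frames(stream, num_present=2)
weights = attention_weights(historical, present[-1])
weighted = weight_frames(historical, weights)
probs = predict_action(weighted, present, class_vectors=[[1.0, 0.0], [0.0, 1.0]])
```

In this toy run the historical frame that resembles the current frame receives the larger weight, so it contributes more to the pooled history than the dissimilar frame does.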