US 11,902,548 B2
Systems, methods and computer media for joint attention video processing
Deepak Sridhar, Richmond Hill (CA); Niamul Quader, Toronto (CA); Srikanth Muralidharan, Thornhill (CA); Yaoxin Li, Montreal (CA); Juwei Lu, North York (CA); and Peng Dai, Markham (CA)
Assigned to HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)
Filed by Deepak Sridhar, Richmond Hill (CA); Niamul Quader, Toronto (CA); Srikanth Muralidharan, Thornhill (CA); Yaoxin Li, Montreal (CA); Juwei Lu, North York (CA); and Peng Dai, Markham (CA)
Filed on Mar. 16, 2021, as Appl. No. 17/203,613.
Prior Publication US 2022/0303560 A1, Sep. 22, 2022
Int. Cl. G06N 3/045 (2023.01); H04N 19/20 (2014.01); G06N 3/082 (2023.01)
CPC H04N 19/20 (2014.11) [G06N 3/045 (2023.01); G06N 3/082 (2013.01)]
20 Claims
 
1. A computer implemented method for processing a video, the method comprising:
receiving a plurality of video frames of the video;
generating a plurality of first input features based on the plurality of video frames;
generating a plurality of second input features based on reversing a temporal order of the plurality of first input features;
generating a first set of joint attention features based on the plurality of first input features;
generating a second set of joint attention features based on the plurality of second input features; and
concatenating the first set of joint attention features and the second set of joint attention features to generate a final set of joint attention features.
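
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not say how input features or joint attention features are computed, so the per-channel mean in `frame_features`, the causal dot-product attention in `causal_attention`, and the choice to concatenate along the feature axis are all assumptions made for illustration, and every function name is hypothetical. A causal placeholder is used because it depends on temporal order, so the reversed branch yields different features from the forward branch.

```python
import math


def frame_features(frames):
    """Hypothetical feature extractor: mean value per channel for each frame.

    Each frame is a list of pixels; each pixel is a list of channel values.
    """
    feats = []
    for frame in frames:
        n_pixels = len(frame)
        n_channels = len(frame[0])
        feats.append(
            [sum(pixel[c] for pixel in frame) / n_pixels for c in range(n_channels)]
        )
    return feats


def causal_attention(feats):
    """Placeholder attention (an assumption, not the claimed joint attention).

    Each time step attends to itself and to earlier steps only, using
    scaled dot-product scores and a softmax over those steps.
    """
    dim = len(feats[0])
    out = []
    for t, query in enumerate(feats):
        visible = feats[: t + 1]
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append(
            [
                sum(w * vec[d] for w, vec in zip(weights, visible)) / total
                for d in range(dim)
            ]
        )
    return out


def process_video(frames):
    """Follow the order of steps in claim 1 on a list of frames."""
    first_inputs = frame_features(frames)       # first input features
    second_inputs = first_inputs[::-1]          # temporal order reversed
    first_attn = causal_attention(first_inputs)
    second_attn = causal_attention(second_inputs)
    # Assumed concatenation: join the two feature vectors at each index.
    # No re-alignment of the reversed branch is attempted here.
    return [a + b for a, b in zip(first_attn, second_attn)]
```

For three frames with two channels each, `process_video` returns three vectors of length four: the first two entries of each vector come from the forward branch and the last two from the reversed branch.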