US 11,902,548 B2
Systems, methods and computer media for joint attention video processing
Deepak Sridhar, Richmond Hill (CA); Niamul Quader, Toronto (CA); Srikanth Muralidharan, Thornhill (CA); Yaoxin Li, Montreal (CA); Juwei Lu, North York (CA); and Peng Dai, Markham (CA)
Assigned to HUAWEI TECHNOLOGIES CO., LTD., Shenzhen (CN)
Filed by Deepak Sridhar, Richmond Hill (CA); Niamul Quader, Toronto (CA); Srikanth Muralidharan, Thornhill (CA); Yaoxin Li, Montreal (CA); Juwei Lu, North York (CA); and Peng Dai, Markham (CA)
Filed on Mar. 16, 2021, as Appl. No. 17/203,613.
Prior Publication US 2022/0303560 A1, Sep. 22, 2022
Int. Cl. G06N 3/045 (2023.01); H04N 19/20 (2014.01); G06N 3/082 (2023.01)

CPC H04N 19/20 (2014.11) [G06N 3/045 (2023.01); G06N 3/082 (2013.01)]

20 Claims

1. A computer implemented method for processing a video, the method comprising:

receiving a plurality of video frames of the video;

generating a plurality of first input features based on the plurality of video frames;

generating a plurality of second input features based on reversing a temporal order of the plurality of first input features;

generating a first set of joint attention features based on the plurality of first input features;

generating a second set of joint attention features based on the plurality of second input features; and

concatenating the first set of joint attention features and the second set of joint attention features to generate a final set of joint attention features.
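
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and everything below beyond the order of the steps is an assumption made for illustration. The names frame_features, joint_attention and process_video are hypothetical. The feature extractor is a toy (mean and max of each frame). The "joint attention" is a stand-in: a causal scaled dot-product attention with no learned weights, chosen only so that the temporal order of the input affects the result. The claim does not specify an axis for the concatenation, so the two sets are simply appended end to end here; joining them along the feature dimension would be an equally plausible reading.

```python
import math


def frame_features(frames):
    """Hypothetical extractor: one [mean, max] feature vector per frame."""
    return [[sum(frame) / len(frame), max(frame)] for frame in frames]


def joint_attention(features):
    """Stand-in attention (assumption, not the claimed joint attention).

    Each time step attends to itself and to earlier steps only, so the
    output depends on the temporal order of the input.
    """
    dim = len(features[0])
    attended = []
    for t, query in enumerate(features):
        visible = features[: t + 1]
        # Scaled dot-product scores between the query and each visible step.
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in visible
        ]
        # Numerically stable softmax over the scores.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        # Weighted average of the visible feature vectors.
        attended.append(
            [sum(w * v[d] for w, v in zip(weights, visible)) / total
             for d in range(dim)]
        )
    return attended


def process_video(frames):
    """Follow the step order of claim 1 on a list of frames."""
    first_inputs = frame_features(frames)      # first input features
    second_inputs = first_inputs[::-1]         # reversed temporal order
    first_set = joint_attention(first_inputs)
    second_set = joint_attention(second_inputs)
    return first_set + second_set              # concatenated final set


if __name__ == "__main__":
    toy_frames = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
    for vector in process_video(toy_frames):
        print(vector)
```

With a causal stand-in like this, the forward pass summarizes each frame using past context and the reversed pass summarizes it using future context, which is one way to see why a reversed copy of the features can carry information the forward copy lacks.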