US 12,014,562 B2
Method and system for automatic speaker framing in video applications
Morten Smidt Proschowsky, Værløse (DK); Sui Kun Guan, Union City, CA (US); Nihit Rajendra Save, Mountain View, CA (US); and Aurangzeb Khan, Portola Valley, CA (US)
Assigned to GN AUDIO A/S, (DK)
Filed by GN Audio A/S, Ballerup (DK)
Filed on Feb. 24, 2021, as Appl. No. 17/184,558.
Prior Publication US 2022/0269882 A1, Aug. 25, 2022
Int. Cl. G06K 9/00 (2022.01); G06T 7/20 (2017.01); G06T 7/70 (2017.01); G06V 10/25 (2022.01); G06V 20/40 (2022.01); G06V 40/10 (2022.01); G10L 17/00 (2013.01); G10L 25/57 (2013.01); H04N 5/262 (2006.01); H04R 1/40 (2006.01); H04R 3/00 (2006.01)
CPC G06V 40/10 (2022.01) [G06T 7/20 (2013.01); G06T 7/70 (2017.01); G06V 10/25 (2022.01); G06V 20/49 (2022.01); G10L 17/00 (2013.01); G10L 25/57 (2013.01); H04N 5/2628 (2013.01); G06T 2207/10016 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/20132 (2013.01); G06T 2207/30196 (2013.01); H04R 1/406 (2013.01); H04R 3/005 (2013.01)]
13 Claims
 
1. A method for framing video, comprising:
associating a current region-of-interest (ROI) corresponding to video of a scene being imaged;
detecting audio signals associated with the scene;
determining active speakers in the scene based on the detected audio signals;
performing a reframing operation comprising dynamically calculating a target region of interest (ROI), wherein
the target ROI is generated based on coordinates associated with a bounding box calculated for the determined active speakers in the scene, and
the target ROI is expanded to include a non-speaker when the target ROI crops the non-speaker;
calculating a degree of overlap between the current ROI and the target ROI, wherein the calculating the degree of overlap between the current ROI and the target ROI comprises determining an intersection over union (IoU) between the current ROI and the target ROI; and
based on the calculated degree of overlap, transitioning to the target ROI from the current ROI using one of a cutover transition technique and a smooth transition technique.
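
The geometric steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Box` type, the function names, the reading of "crops" as a partial overlap, the 0.3 threshold, and the rule that a high IoU selects the smooth transition are all assumptions made for illustration; the claim states only that the transition is chosen based on the calculated degree of overlap.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (hypothetical helper type)."""
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        # Degenerate or inverted boxes count as empty.
        return max(0.0, self.x1 - self.x0) * max(0.0, self.y1 - self.y0)


def union_box(boxes: Iterable[Box]) -> Box:
    """Smallest box that encloses every box in `boxes`."""
    items: List[Box] = list(boxes)
    return Box(min(b.x0 for b in items), min(b.y0 for b in items),
               max(b.x1 for b in items), max(b.y1 for b in items))


def intersection(a: Box, b: Box) -> Box:
    """Overlap of two boxes; its area is zero when they do not overlap."""
    return Box(max(a.x0, b.x0), max(a.y0, b.y0),
               min(a.x1, b.x1), min(a.y1, b.y1))


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter = intersection(a, b).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def target_roi(speakers: List[Box], non_speakers: List[Box]) -> Box:
    """Bounding box of the active speakers, grown to avoid cropping anyone.

    Assumption: a non-speaker is "cropped" when the ROI covers only part
    of that person's box.
    """
    roi = union_box(speakers)
    changed = True
    while changed:  # growing the ROI may start cropping another person
        changed = False
        for person in non_speakers:
            inter = intersection(roi, person).area()
            if 0 < inter < person.area():
                roi = union_box([roi, person])
                changed = True
    return roi


def choose_transition(current: Box, target: Box, threshold: float = 0.3) -> str:
    """Pick a transition from the IoU (threshold and mapping are assumed)."""
    return "smooth" if iou(current, target) >= threshold else "cutover"
```

For example, with speakers at `Box(100, 100, 200, 300)` and `Box(400, 120, 500, 320)`, a non-speaker at `Box(480, 100, 600, 300)` straddles the right edge of the speakers' bounding box, so the target ROI grows to include that person, while a non-speaker located entirely outside the ROI leaves it unchanged.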