CPC G06V 40/10 (2022.01) [G06T 7/20 (2013.01); G06T 7/70 (2017.01); G06V 10/25 (2022.01); G06V 20/49 (2022.01); G10L 17/00 (2013.01); G10L 25/57 (2013.01); H04N 5/2628 (2013.01); G06T 2207/10016 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/20132 (2013.01); G06T 2207/30196 (2013.01); H04R 1/406 (2013.01); H04R 3/005 (2013.01)] | 13 Claims |
1. A method for framing video, comprising:
associating a current region-of-interest (ROI) corresponding to video of a scene being imaged;
detecting audio signals associated with the scene;
determining active speakers in the scene based on the detected audio signals;
performing a reframing operation comprising dynamically calculating a target region of interest (ROI), wherein
the target ROI is generated based on coordinates associated with a bounding box calculated for the determined active speakers in the scene, and
the target ROI is expanded to include a non-speaker when the target ROI crops the non-speaker;
calculating a degree of overlap between the current ROI and the target ROI, wherein the calculating the degree of overlap between the current ROI and the target ROI comprises determining an intersection over union (IoU) between the current ROI and the target ROI; and
based on the calculated degree of overlap, transitioning to the target ROI from the current ROI using one of a cutover transition technique and a smooth transition technique.
|
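The reframing and transition steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The `Box` type, the function names, the rule that a non-speaker is "cropped" when the ROI partly overlaps that person's box, and the 0.5 IoU threshold are all assumptions made for illustration. The claim also does not state which degree of overlap selects which transition; the sketch assumes that high overlap selects the smooth transition and low overlap selects the cutover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (hypothetical helper type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def union_box(boxes):
    """Smallest box that encloses every box in the list."""
    return Box(min(b.x1 for b in boxes), min(b.y1 for b in boxes),
               max(b.x2 for b in boxes), max(b.y2 for b in boxes))


def crops(roi, person):
    """Assumed meaning of 'crops': the ROI overlaps the person only partly."""
    overlaps = (roi.x1 < person.x2 and person.x1 < roi.x2 and
                roi.y1 < person.y2 and person.y1 < roi.y2)
    contains = (roi.x1 <= person.x1 and roi.y1 <= person.y1 and
                roi.x2 >= person.x2 and roi.y2 >= person.y2)
    return overlaps and not contains


def target_roi(speakers, non_speakers):
    """Bounding box of the active speakers, expanded to include any
    non-speaker that the box would otherwise crop."""
    roi = union_box(speakers)
    changed = True
    while changed:  # growing the ROI may start cropping another person
        changed = False
        for person in non_speakers:
            if crops(roi, person):
                roi = union_box([roi, person])
                changed = True
    return roi


def iou(a, b):
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1),
                min(a.x2, b.x2), min(a.y2, b.y2)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def choose_transition(current, target, threshold=0.5):
    """Assumed rule: high overlap selects 'smooth', low overlap 'cutover'.
    The threshold value is hypothetical."""
    return "smooth" if iou(current, target) >= threshold else "cutover"


current = Box(0, 0, 100, 100)
target = target_roi([Box(50, 0, 150, 100)], [Box(140, 0, 170, 100)])
print(target, choose_transition(current, target))
```

In this example the speaker box would cut through the non-speaker at x = 150, so the target ROI grows to x = 170 before the IoU against the current ROI is computed.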