| CPC G06V 40/161 (2022.01) [H04N 7/15 (2013.01)] | 15 Claims |

|
1. A method comprising:
obtaining, using a head detection model and for an image of a video stream, head detection information that identifies a plurality of heads detected in the image;
selecting a layout based on a number of the plurality of heads detected in the image;
identifying a set of templates corresponding to a layout;
creating, individually, a plurality of head frame definitions for the plurality of heads using the set of templates;
generating an image frame definition combining the plurality of head frame definitions;
obtaining, from the head detection information, a head detection bounding box for a head of the plurality of heads;
adding a buffer around the head detection bounding box;
defining, using the buffer and a template of the plurality of templates, a zoom amount and an alignment for a head frame definition; and
processing the video stream using the image frame definition,
wherein defining the zoom amount and the alignment comprises aligning the buffer to a top headline and bottom headline of the template.
|
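For illustration only, the steps recited in claim 1 can be sketched in Python. This is a minimal sketch under stated assumptions, not the claimed implementation. Every name here (`Box`, `Template`, `HeadFrame`, `select_layout`, `templates_for`, `add_buffer`, `define_head_frame`, `build_image_frame`) is hypothetical, as are the layout names, the 25% buffer ratio, and the headline positions at 25% and 75% of the template height. The head detection model is assumed to have already produced the bounding boxes, and the crop region is not clamped to the image bounds.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in image pixel coordinates (y grows downward)."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top

    @property
    def center_x(self) -> float:
        return (self.left + self.right) / 2


@dataclass(frozen=True)
class Template:
    """Hypothetical template: an output cell plus two headlines (y offsets in the cell)."""
    width: float
    height: float
    top_headline: float
    bottom_headline: float


@dataclass(frozen=True)
class HeadFrame:
    """Head frame definition: zoom factor plus the source region to crop."""
    zoom: float
    crop: Box


def select_layout(num_heads: int) -> str:
    # Assumed mapping from head count to layout name.
    return {1: "single", 2: "side_by_side"}.get(num_heads, "row")


def templates_for(layout: str, num_heads: int, out_w: float, out_h: float) -> List[Template]:
    # Assumption: every layout splits the output width evenly, one template per head.
    cell_w = out_w / num_heads
    return [Template(cell_w, out_h, 0.25 * out_h, 0.75 * out_h) for _ in range(num_heads)]


def add_buffer(box: Box, ratio: float = 0.25) -> Box:
    # Grow the detection box by `ratio` of its size on every side.
    dx, dy = box.width * ratio, box.height * ratio
    return Box(box.left - dx, box.top - dy, box.right + dx, box.bottom + dy)


def define_head_frame(head: Box, template: Template) -> HeadFrame:
    buffered = add_buffer(head)
    # Zoom so the buffered head spans exactly the gap between the two headlines.
    zoom = (template.bottom_headline - template.top_headline) / buffered.height
    crop_w, crop_h = template.width / zoom, template.height / zoom
    # Alignment: the buffer's top edge lands on the top headline, centered horizontally.
    crop_top = buffered.top - template.top_headline / zoom
    crop_left = buffered.center_x - crop_w / 2
    return HeadFrame(zoom, Box(crop_left, crop_top, crop_left + crop_w, crop_top + crop_h))


def build_image_frame(heads: List[Box], out_w: float, out_h: float) -> List[HeadFrame]:
    """Image frame definition: one head frame definition per detected head."""
    layout = select_layout(len(heads))
    templates = templates_for(layout, len(heads), out_w, out_h)
    return [define_head_frame(h, t) for h, t in zip(heads, templates)]
```

In this sketch, calling `build_image_frame` with two detected boxes and a 1280 by 720 output yields two head frame definitions, each giving the zoom and crop region that would place the buffered head between the template's top and bottom headlines. A real system would then apply these definitions to each frame of the video stream.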