| CPC G06V 40/166 (2022.01) [G06V 40/172 (2022.01); G10L 15/25 (2013.01)] | 15 Claims |

|
1. A video image composition method, comprising:
receiving a priority level list, wherein the priority level list comprises a plurality of priority levels of a plurality of person identities;
receiving a plurality of video streams;
identifying a plurality of identity labels corresponding to a plurality of human face frame images in the video streams;
determining a plurality of display levels corresponding to the human face frame images, according to the identity labels and the priority level list;
detecting a part of the human face frame images being in speaking status;
constituting at least one of the part of the human face frame images being in speaking status as a main display area of a video image, according to the display levels; and
in a moderator mode, determining a first human face frame image from the human face frame images being in speaking status, wherein the first human face frame image corresponds to a first identity label comprising a highest display priority order of the display levels.
|
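
The selection logic recited in claim 1 can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the patented implementation. The names `FaceFrame`, `display_level`, and `compose_main_area` are hypothetical. The sketch assumes that face identification and speaking detection have already produced an identity label and a speaking flag per face frame, that a larger number means a higher priority, and that an identity missing from the priority level list receives level 0. Outside moderator mode it simply orders all speaking faces by display level, which is one possible reading of "at least one".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class FaceFrame:
    """Hypothetical record for one human face frame image."""
    stream_id: int        # which video stream the face came from
    identity_label: str   # output of an upstream face identifier (assumed)
    is_speaking: bool     # output of an upstream speaking detector (assumed)


def display_level(face: FaceFrame, priority_list: Dict[str, int]) -> int:
    """Map an identity label to a display level via the priority level list.

    Assumption: unknown identities get level 0, the lowest level.
    """
    return priority_list.get(face.identity_label, 0)


def compose_main_area(
    faces: List[FaceFrame],
    priority_list: Dict[str, int],
    moderator_mode: bool = False,
) -> List[FaceFrame]:
    """Return the face frames that make up the main display area."""
    # Keep only the face frames detected as being in speaking status.
    speaking = [f for f in faces if f.is_speaking]
    if not speaking:
        return []
    if moderator_mode:
        # Moderator mode: the single speaking face with the highest level.
        return [max(speaking, key=lambda f: display_level(f, priority_list))]
    # Assumed default: all speaking faces, highest display level first.
    return sorted(
        speaking,
        key=lambda f: display_level(f, priority_list),
        reverse=True,
    )


faces = [
    FaceFrame(stream_id=0, identity_label="alice", is_speaking=True),
    FaceFrame(stream_id=1, identity_label="bob", is_speaking=True),
    FaceFrame(stream_id=2, identity_label="carol", is_speaking=False),
]
priority = {"alice": 1, "bob": 3, "carol": 5}

main_area = compose_main_area(faces, priority, moderator_mode=True)
```

In this example `carol` holds the highest priority level but is not speaking, so in moderator mode the main display area contains only the face frame labeled `bob`, the highest-level identity among the current speakers.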