| CPC H04N 7/15 (2013.01) [G06F 3/162 (2013.01); H04N 23/611 (2023.01); H04N 23/661 (2023.01); H04N 23/90 (2023.01); H04R 3/005 (2013.01)] | 23 Claims |

|
1. A method for selecting video and audio in a video conference system for a conference room comprising:
operating at least two smartphone cameras in the conference room to generate respective video streams;
transmitting the generated respective video streams;
generating video-associated metadata (VAM) by the respective smartphones regarding the transmitted video streams;
transmitting the generated video-associated metadata to at least one conference room transceiver;
receiving the generated video-associated metadata from each of the at least two smartphone cameras by at least one room processor transceiver;
generating audio data by at least two microphones located within the conference room, the microphones not communicatively coupled to any of the smartphones;
transmitting the generated audio data to the at least one conference room transceiver;
receiving the generated audio data from each of the at least two microphones by the at least one room processor transceiver;
generating an audio composite by a room processor communicatively coupled to the at least one room processor transceiver by combining all of the received audio data;
analyzing the received video-associated metadata by the room processor;
selecting one of the video streams to be a selected video stream based on the analyzed video-associated metadata; and
transmitting the selected video stream and the audio composite to a remote endpoint.
|
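
The room-processor steps recited in claim 1 (combining all received audio data, analyzing the video-associated metadata, and selecting one video stream) can be illustrated with a minimal sketch. The claim does not define the contents of the metadata, the selection criterion, or the mixing method, so everything specific here is an assumption made for illustration: the `VideoMetadata` fields, the ranking by `speaker_confidence`, and the sample-wise averaging in `mix_audio` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class VideoMetadata:
    """Hypothetical VAM record; the claim does not define these fields."""
    camera_id: str
    face_count: int            # assumed: faces detected in the current frame
    speaker_confidence: float  # assumed: 0.0-1.0 score that the talker is in view


def mix_audio(channels: Sequence[Sequence[float]]) -> List[float]:
    """Combine all microphone channels by sample-wise averaging (assumed method)."""
    if not channels:
        return []
    # Truncate to the shortest channel so every sample index is valid.
    length = min(len(ch) for ch in channels)
    return [sum(ch[i] for ch in channels) / len(channels) for i in range(length)]


def select_stream(metadata: Sequence[VideoMetadata]) -> str:
    """Return the camera_id whose metadata ranks highest under the assumed score."""
    if len(metadata) < 2:
        raise ValueError("claim 1 recites at least two smartphone cameras")
    # Rank by speaker confidence first, then by face count as a tie-breaker.
    best = max(metadata, key=lambda m: (m.speaker_confidence, m.face_count))
    return best.camera_id


def room_processor_step(
    metadata: Sequence[VideoMetadata],
    mic_channels: Sequence[Sequence[float]],
    video_streams: Dict[str, bytes],
) -> Tuple[bytes, List[float]]:
    """Produce the (selected video stream, audio composite) pair for the remote endpoint."""
    selected_id = select_stream(metadata)
    return video_streams[selected_id], mix_audio(mic_channels)


# Usage with made-up inputs: two cameras, two standalone microphones.
metadata = [
    VideoMetadata("cam_a", face_count=2, speaker_confidence=0.3),
    VideoMetadata("cam_b", face_count=1, speaker_confidence=0.9),
]
mics = [[0.0, 1.0, 0.5], [1.0, 0.0, 0.5]]
streams = {"cam_a": b"A", "cam_b": b"B"}
video, audio = room_processor_step(metadata, mics, streams)
```

In this sketch the selection depends only on the metadata, never on the video frames themselves, which mirrors the claim's separation between the transmitted video streams and the metadata the room processor analyzes. The audio path is independent of the selection, since the claim combines all received microphone data regardless of which camera is chosen.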