US 11,736,801 B2
Merging webcam signals from multiple cameras
Tom Bushman, Marblehead, MA (US); Ilya Moskovko, Campbell, CA (US); and Howard Brown, Arlington, MA (US)
Assigned to OWL LABS INC., Boston, MA (US)
Filed by Owl Labs Inc., Somerville, MA (US)
Filed on Aug. 24, 2021, as Appl. No. 17/411,016.
Claims priority of provisional application 63/069,710, filed on Aug. 24, 2020.
Prior Publication US 2022/0070371 A1, Mar. 3, 2022
Int. Cl. H04N 23/698 (2023.01); G06F 3/01 (2006.01); G06F 3/16 (2006.01); H04N 5/06 (2006.01); H04N 23/611 (2023.01); H04N 23/661 (2023.01)
CPC H04N 23/698 (2023.01) [G06F 3/013 (2013.01); G06F 3/165 (2013.01); H04N 5/06 (2013.01); H04N 23/611 (2023.01); H04N 23/661 (2023.01)]
24 Claims
 
1. A system comprising:
a processor;
a camera operatively coupled to the processor and configured to capture a first panorama view;
an audio sensor system operatively coupled to the processor and configured to capture audio corresponding to the first panorama view;
a first communication interface operatively coupled to the processor; and
a memory storing computer-readable instructions that, when executed, cause the processor to:
determine a first bearing of a person within the first panorama view,
determine a first gaze direction of the person within the first panorama view,
receive, from an external source and via the first communication interface, a second panorama view,
receive, from the external source via the first communication interface, a second bearing of the person within the second panorama view,
receive, from the external source via the first communication interface, a second gaze direction of the person within the second panorama view,
compare the first gaze direction and the second gaze direction,
select, based on comparing the first gaze direction and the second gaze direction, a selected panorama view from between the first panorama view and the second panorama view,
select, based on the selected panorama view, a selected bearing of the person from between the first bearing of the person and the second bearing of the person,
form a localized subscene video signal based on the selected panorama view along the selected bearing of the person,
generate a stage view signal based on the localized subscene video signal,
generate a scaled panorama view signal based on the first panorama view or the second panorama view,
composite a composited signal comprising the scaled panorama view signal and the stage view signal,
receive audio corresponding to the second panorama view,
detect an error in the audio corresponding to the second panorama view by finding a missing audio data of the audio corresponding to the second panorama view,
conceal the detected error in the audio corresponding to the second panorama view by replacing the missing audio data,
synchronize the audio corresponding to the first panorama view and the audio corresponding to the second panorama view,
merge the audio corresponding to the first panorama view and the audio corresponding to the second panorama view into a merged audio signal,
composite the merged audio signal with the composited signal, and
transmit the composited signal.
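
The gaze-comparison and selection steps recited above can be illustrated with a minimal sketch. The claim does not state how the two gaze directions are compared, so the rule used here (prefer the panorama whose camera the person faces more directly) is an assumption, as are the names Observation, wrap_deg, select_view and subscene_bounds, the representation of gaze as a signed angle in degrees where 0 means looking straight at the camera, and the 40-degree subscene width.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One camera's report about the same person (hypothetical structure)."""
    view_id: str        # which panorama view this observation belongs to
    bearing_deg: float  # bearing of the person within that panorama view
    gaze_deg: float     # assumed: angle between gaze and the line to the camera


def wrap_deg(angle: float) -> float:
    """Wrap an angle in degrees to the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def select_view(first: Observation, second: Observation) -> Observation:
    """Compare the two gaze directions and pick one observation.

    Assumed rule: the smaller absolute gaze angle wins, and ties go to the
    first (local) view. The returned observation carries both the selected
    panorama view and the selected bearing.
    """
    first_off = abs(wrap_deg(first.gaze_deg))
    second_off = abs(wrap_deg(second.gaze_deg))
    return first if first_off <= second_off else second


def subscene_bounds(bearing_deg: float, width_deg: float = 40.0) -> tuple:
    """Angular window (start, end) of a localized subscene centered on a bearing."""
    half = width_deg / 2.0
    return ((bearing_deg - half) % 360.0, (bearing_deg + half) % 360.0)


local = Observation("first", bearing_deg=90.0, gaze_deg=50.0)
remote = Observation("second", bearing_deg=270.0, gaze_deg=-10.0)
chosen = select_view(local, remote)
window = subscene_bounds(chosen.bearing_deg)
```

In this example the person is turned 50 degrees away from the first camera and 10 degrees away from the second, so the second panorama view and its bearing are chosen, and the subscene is cropped around that bearing.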
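
The audio steps recited above (detecting missing audio data, concealing it by replacement, synchronizing, and merging) can likewise be sketched under stated assumptions. The claim does not specify a transport or a concealment method; this sketch assumes the second audio stream arrives as sequence-numbered frames, treats a gap in the sequence numbers as the error, replaces a missing frame with a copy of the previous frame, aligns the streams with a known sample offset, and merges by averaging. All function names are hypothetical.

```python
def conceal_missing(frames, first_seq, last_seq, frame_len):
    """Return (ordered_frames, concealed_seqs) for seq numbers first_seq..last_seq.

    frames maps a sequence number to a list of samples. A sequence number
    that is absent counts as missing audio data and is replaced by a copy
    of the previous frame (silence if there is no previous frame).
    """
    ordered, concealed = [], []
    previous = [0.0] * frame_len
    for seq in range(first_seq, last_seq + 1):
        frame = frames.get(seq)
        if frame is None:
            frame = list(previous)  # replace the missing audio data
            concealed.append(seq)
        ordered.append(frame)
        previous = frame
    return ordered, concealed


def flatten(frames):
    """Concatenate a list of frames into one list of samples."""
    return [sample for frame in frames for sample in frame]


def synchronize(local, remote, offset):
    """Align two sample lists, where local[offset] corresponds to remote[0].

    A negative offset means remote[-offset] corresponds to local[0].
    Both outputs are trimmed to their common length.
    """
    if offset >= 0:
        local = local[offset:]
    else:
        remote = remote[-offset:]
    n = min(len(local), len(remote))
    return local[:n], remote[:n]


def merge(a, b):
    """Merge two aligned sample lists by averaging them sample by sample."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]


received = {0: [1.0, 1.0], 2: [3.0, 3.0]}  # frame 1 was lost in transit
ordered, concealed = conceal_missing(received, 0, 2, frame_len=2)
remote_audio = flatten(ordered)
local_audio = [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 3.0, 3.0]
aligned_local, aligned_remote = synchronize(local_audio, remote_audio, offset=2)
merged = merge(aligned_local, aligned_remote)
```

Repeating the previous frame is only one simple way to replace missing data; interpolation or silence insertion would fit the same claim language equally well.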