US 12,142,018 B2
Video stitching method and system
Fabrizio Moggio, Turin (IT); Nicola Reale, Turin (IT); Andrea Varesio, Turin (IT); and Marco Vecchietti, Turin (IT)
Assigned to Telecom Italia S.p.A., Milan (IT)
Appl. No. 17/777,537
Filed by Telecom Italia S.p.A., Milan (IT)
PCT Filed Nov. 10, 2020, PCT No. PCT/EP2020/081594 § 371(c)(1), (2) Date May 17, 2022, PCT Pub. No. WO2021/099178, PCT Pub. Date May 27, 2021.
Claims priority of application No. 102019000021399 (IT), filed on Nov. 18, 2019.
Prior Publication US 2024/0029386 A1, Jan. 25, 2024
Int. Cl. G06V 10/10 (2022.01); G06T 7/194 (2017.01); G06V 10/25 (2022.01); H04N 5/272 (2006.01)

CPC G06V 10/16 (2022.01) [G06T 7/194 (2017.01); G06V 10/25 (2022.01); H04N 5/272 (2013.01)]

20 Claims

1. A video communication system adapted to be interfaced with a plurality of video cameras for receiving respective video signals therefrom, comprising:

one or more computing devices having memory storing application software that, when executed, causes the video communication system to:

extract from each video signal received from the video cameras a corresponding sequence of video frames, each sequence of video frames comprising a first sequence portion comprising background video frames shooting background only and a subsequent second sequence portion comprising video frames shooting a foreground subject;

receive the video frames of the second sequence portion of each sequence of video frames;

every time new video frames of the second sequence portion of each sequence of video frames are received:

select a corresponding dominant video camera among the plurality of video cameras based on the received new video frames, the dominant video camera being the video camera having a best point of view of the foreground subject, and

process the received new video frames to generate corresponding operative seam masks to be used for stitching together the new video frames;

receive the background video frames of the first sequence portion of each sequence of video frames and generate for each video camera a corresponding set of background seam masks according to the received background video frames, wherein:

each seam mask among the operative seam masks and background seam masks has a respective area and comprises a graph cut subdividing the area of the seam mask into:

remove area portions defining, when the seam mask is superimposed on a video frame, corresponding area portions of the video frame to be cut out for being removed, and

keep area portions defining, when the seam mask is superimposed on a video frame, corresponding area portions of the video frame to be kept,

each set of background seam masks corresponding to a video camera comprises background seam masks to be used for stitching together video frames of the second sequence portion when the video camera is selected as the dominant video camera, wherein each set of background seam masks is generated through a graph cut procedure providing for:

scanning overlapping pixels in the background video frames, and

calculating the graph cut which causes lowest junction distortions through minimization of a cost function regarding a Euclidean distance among pixels astride the graph cut;

every time new video frames of the second sequence portion of each sequence of video frames are received:

select the set of background seam masks corresponding to the dominant video camera;

generate combined masks by combining the background seam masks of the selected set with the operative seam masks; and

generate a panoramic video frame by stitching together the received new video frames by removing therefrom area portions using the combined masks.
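
The seam-mask steps recited above can be illustrated with a minimal sketch for the overlap region of two cameras. It is not the claimed implementation and rests on several assumptions: a dynamic-programming seam search stands in for the graph cut procedure; the cost at each position is taken as the Euclidean distance between the two cameras' overlapping pixels; masks are combined with a logical AND of their keep areas; and every pixel removed from the left frame is filled from the right frame. All function names (min_cost_seam, seam_to_keep_mask, combine_masks, stitch_overlap) are hypothetical.

```python
from typing import List, Sequence

Pixel = Sequence[float]        # e.g. (R, G, B) or a 1-tuple for grayscale
Image = List[List[Pixel]]      # rows of pixels covering the overlap region
Mask = List[List[bool]]        # True = keep area, False = remove area


def euclidean(p: Pixel, q: Pixel) -> float:
    """Euclidean distance between two pixels."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def min_cost_seam(left: Image, right: Image) -> List[int]:
    """Return, per row, the column where a top-to-bottom seam crosses.

    Assumption: a dynamic-programming search replaces the graph cut, and the
    cost is the distance between the overlapping pixels of the two frames.
    """
    h, w = len(left), len(left[0])
    cost = [[euclidean(left[y][x], right[y][x]) for x in range(w)]
            for y in range(h)]
    # Accumulate the cheapest path cost reaching each pixel from the top row.
    acc = [cost[0][:]]
    for y in range(1, h):
        prev = acc[-1]
        acc.append([cost[y][x] + min(prev[max(0, x - 1):x + 2])
                    for x in range(w)])
    # Backtrack from the cheapest pixel in the bottom row.
    x = min(range(w), key=lambda i: acc[-1][i])
    seam = [0] * h
    seam[-1] = x
    for y in range(h - 2, -1, -1):
        lo, hi = max(0, x - 1), min(w, x + 2)
        x = min(range(lo, hi), key=lambda i: acc[y][i])
        seam[y] = x
    return seam


def seam_to_keep_mask(seam: List[int], width: int) -> Mask:
    """Keep the left frame's pixels strictly to the left of the seam."""
    return [[x < s for x in range(width)] for s in seam]


def combine_masks(background: Mask, operative: Mask) -> Mask:
    """Assumed combination rule: keep a pixel only if both masks keep it."""
    return [[b and o for b, o in zip(brow, orow)]
            for brow, orow in zip(background, operative)]


def stitch_overlap(left: Image, right: Image, keep_left: Mask) -> Image:
    """Take the left pixel where kept, otherwise the right pixel."""
    return [[l if k else r for l, r, k in zip(lrow, rrow, krow)]
            for lrow, rrow, krow in zip(left, right, keep_left)]
```

In this sketch the background seam would be computed once from the background-only frames, while an operative mask would be supplied for each new batch of frames and combined with the background mask before stitching.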