US 11,671,551 B2
Synchronization of multi-device image data using multimodal sensor data
Jason Vandekieft, Madison, WI (US); Nikolaos Georgis, San Diego, CA (US); and Chad Goolbis, Madison, WI (US)
Assigned to SONY GROUP CORPORATION, Tokyo (JP)
Filed by SONY GROUP CORPORATION, Tokyo (JP)
Filed on May 24, 2021, as Appl. No. 17/328,920.
Prior Publication US 2022/0377208 A1, Nov. 24, 2022
Int. Cl. H04N 5/04 (2006.01); G10L 25/18 (2013.01); G10L 25/51 (2013.01); G06F 18/22 (2023.01); H04N 23/80 (2023.01); H04N 23/90 (2023.01)
CPC H04N 5/04 (2013.01) [G06F 18/22 (2023.01); G10L 25/18 (2013.01); G10L 25/51 (2013.01); H04N 23/80 (2023.01); H04N 23/90 (2023.01)]
20 Claims
1. A system, comprising:
circuitry configured to:
receive, from a plurality of image-capture devices, image data comprising a plurality of image sequences of a first object, wherein each image sequence of the plurality of image sequences corresponds to an image-capture device of the plurality of image-capture devices;
receive a set of sensor data from the plurality of image-capture devices,
wherein each sensor data of the received set of sensor data comprises at least one of an Inertial Measurement Unit (IMU) data or audio data, and
each sensor data of the received set of sensor data is associated with a duration of acquisition of a corresponding image sequence of the plurality of image sequences;
extract, from the received set of sensor data, a first IMU data corresponding to a first image sequence of the plurality of image sequences;
generate a first spectrogram of the first IMU data;
filter the generated first spectrogram based on one of a first two-dimensional (2D) diamond kernel or a first masked max filter to generate a first filter result;
convert the first filter result to a first list of one of time-domain values or frequency-domain values;
generate a first lookup key with first offset values to neighboring list elements of the first list;
determine, based on the generated first lookup key, a match between a first set of image frames of the first image sequence and a second set of image frames of a second image sequence of the plurality of image sequences;
compute an offset between the first set of image frames and the second set of image frames, based on the match; and
synchronize the first image sequence with the second image sequence based on the computed offset.
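
The processing steps recited in claim 1 can be illustrated with a minimal sketch. It is one possible reading of the claim language, not the patented implementation. Everything below is an assumption made for illustration: the function names, the window and hop sizes, the peak threshold, the fan-out, the use of a maximum filter over a diamond-shaped neighbourhood, the (frequency, frequency delta, time delta) key layout, the vote-counting match, and the IMU sample rate and video frame rate in the demo.

```python
import numpy as np
from collections import Counter

def spectrogram(signal, win=64, hop=16):
    """Magnitude STFT; rows are frequency bins, columns are time frames."""
    window = np.hanning(win)
    starts = range(0, len(signal) - win + 1, hop)
    frames = np.stack([signal[s:s + win] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def diamond_offsets(radius):
    """(df, dt) offsets inside a 2D diamond: |df| + |dt| <= radius."""
    r = range(-radius, radius + 1)
    return [(df, dt) for df in r for dt in r if abs(df) + abs(dt) <= radius]

def peak_list(spec, radius=3, min_rel=0.1):
    """Keep bins equal to the maximum of their diamond neighbourhood and
    return them as a time-sorted list of (time_frame, freq_bin) pairs."""
    n_f, n_t = spec.shape
    padded = np.pad(spec, radius, mode="constant", constant_values=-np.inf)
    local_max = np.full(spec.shape, -np.inf)
    for df, dt in diamond_offsets(radius):
        shifted = padded[radius + df:radius + df + n_f,
                         radius + dt:radius + dt + n_t]
        local_max = np.maximum(local_max, shifted)
    mask = (spec == local_max) & (spec > min_rel * spec.max())
    f_idx, t_idx = np.nonzero(mask)
    return sorted(zip(t_idx.tolist(), f_idx.tolist()))

def lookup_keys(peaks, fan_out=5, max_dt=50):
    """Map each key (f1, f2 - f1, t2 - t1), built from a peak and its
    neighbouring list elements, to the anchor times where it occurs."""
    table = {}
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            if t2 - t1 <= max_dt:
                table.setdefault((f1, f2 - f1, t2 - t1), []).append(t1)
    return table

def estimate_offset(keys_a, keys_b):
    """Vote on (time in A - time in B) over all matching keys.
    Returns (offset_in_spectrogram_frames, votes), or (None, 0)."""
    votes = Counter()
    for key, times_a in keys_a.items():
        for tb in keys_b.get(key, []):
            for ta in times_a:
                votes[ta - tb] += 1
    if not votes:
        return None, 0
    return votes.most_common(1)[0]

def to_video_frames(offset, hop, imu_rate_hz, fps):
    """Convert a spectrogram-frame offset to a whole number of video frames."""
    return round(offset * hop / imu_rate_hz * fps)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    imu_a = rng.standard_normal(4000)   # synthetic stand-in for IMU samples
    imu_b = imu_a[160:]                 # device B starts 160 samples later
    keys_a = lookup_keys(peak_list(spectrogram(imu_a)))
    keys_b = lookup_keys(peak_list(spectrogram(imu_b)))
    offset, votes = estimate_offset(keys_a, keys_b)
    print("offset (spectrogram frames):", offset, "votes:", votes)
    # Assumed rates: 200 Hz IMU, 30 fps video.
    print("video frames to skip in A:", to_video_frames(offset, 16, 200, 30))
```

In this sketch, synchronization would amount to dropping the computed number of leading video frames from the sequence that started earlier. The synthetic demo shifts one signal by a whole number of hops so that both spectrograms align exactly; real IMU streams from separate devices would differ in noise and sampling, so the vote count, rather than exact equality, is what indicates a match.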