US 12,192,599 B2
Asynchronous content analysis for synchronizing audio and video streams
Indervir Singh Banipal, San Jose, CA (US); Shikhar Kwatra, San Jose, CA (US); Vijay Ekambaram, Chennai (IN); and Hemant Kumar Sivaswamy, Pune (IN)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Jun. 12, 2023, as Appl. No. 18/333,132.
Prior Publication US 2024/0414418 A1, Dec. 12, 2024
Int. Cl. H04N 21/8547 (2011.01); H04N 21/233 (2011.01); H04N 21/234 (2011.01); H04N 21/242 (2011.01)
CPC H04N 21/8547 (2013.01) [H04N 21/233 (2013.01); H04N 21/23418 (2013.01); H04N 21/242 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method comprising:
performing, using a video classifier, video reference point classification of a video stream based on an audio-video dataset, wherein the audio-video dataset comprises audio-video data of objects causing a Doppler-effect sound in a corresponding audio stream;
performing, using an audio classifier, audio reference point classification of the corresponding audio stream based on the audio-video dataset;
identifying, based on the video reference point classification, a set of video segments comprising object-related reference points in the video stream, wherein the object-related reference points in the video stream correspond to the Doppler-effect sound in the audio stream;
identifying, based on the audio reference point classification, a set of audio segments comprising object-related reference points of the Doppler-effect sound in the audio stream;
correlating object-related reference points in the video segments of the video stream and in the audio segments of the audio stream to identify a set of audio-video synchronization candidates; and
comparing contexts of the audio-video synchronization candidates to identify an audio-video synchronization candidate used to synchronize the audio stream and the video stream.
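The correlation and candidate-selection steps of the claim can be sketched in code. The following is a minimal illustration, not the patented implementation: it assumes the video and audio classifiers have already emitted object-related reference points as `(timestamp, object_label)` pairs (the names `correlate_reference_points`, `select_sync_offset`, and the sample labels are all hypothetical), pairs points with matching labels to produce candidate time offsets, and then compares the candidates by picking the offset supported by the most reference-point pairs.

```python
from collections import Counter

def correlate_reference_points(video_points, audio_points):
    """Pair object-related reference points from the video and audio
    streams whose object labels match, yielding candidate time offsets
    (audio_time - video_time) for synchronization."""
    candidates = []
    for v_time, v_label in video_points:
        for a_time, a_label in audio_points:
            if v_label == a_label:
                candidates.append(round(a_time - v_time, 2))
    return candidates

def select_sync_offset(candidates):
    """Compare the candidates' context in a simple way: choose the
    offset supported by the most reference-point pairs."""
    if not candidates:
        return None
    offset, _count = Counter(candidates).most_common(1)[0]
    return offset

# Hypothetical classifier output: (timestamp in seconds, object label)
video_points = [(1.0, "car"), (4.0, "train"), (7.5, "siren")]
audio_points = [(2.2, "car"), (5.2, "train"), (8.7, "siren")]

offsets = correlate_reference_points(video_points, audio_points)
print(select_sync_offset(offsets))  # → 1.2
```

In this toy example every matched pair agrees on a 1.2-second lag of the audio stream, so that offset is selected; a real system would additionally weight candidates by the richer contextual comparison the claim describes.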