| CPC G06F 3/165 (2013.01) [G06V 20/41 (2022.01); G10L 25/51 (2013.01); H04S 5/005 (2013.01)] | 20 Claims |

|
1. A method of assigning spatial information to audio segments, comprising:
receiving first video frames that include a visual object;
receiving a first audio segment, wherein the first audio segment includes an auditory event associated with the visual object, and wherein the first audio segment is non-spatialized;
upon determining that second video frames do not include the visual object and that a first time difference between the first video frames and the second video frames is not longer than a certain time, using a motion vector of the visual object to assign a spatial location to the auditory event in at least one of the second video frames;
receiving a second audio segment, wherein the second audio segment includes the auditory event;
receiving third video frames, wherein the third video frames do not include the visual object;
upon determining that the third video frames do not include the visual object and that a second time difference between the first video frames and the third video frames is longer than the certain time, assigning the auditory event to a diffuse sound field; and
generating an audio output that includes spatial locations of the visual object to a listener.
|
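
The branching recited in claim 1 for an auditory event whose visual object has left the frame can be illustrated with a minimal sketch. This is not the claimed implementation. The names `Placement` and `place_offscreen_event`, the 2D coordinates, and the use of linear extrapolation are assumptions made for illustration; the claim states only that a motion vector is used to assign the spatial location.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec2 = Tuple[float, float]


@dataclass
class Placement:
    """Hypothetical result type: a point source or a diffuse sound field."""
    kind: str                      # "point" or "diffuse"
    position: Optional[Vec2] = None


def place_offscreen_event(last_position: Vec2,
                          motion_vector: Vec2,
                          elapsed: float,
                          certain_time: float) -> Placement:
    """Place an auditory event whose visual object is no longer in frame.

    last_position: where the object was in the first video frames.
    motion_vector: displacement per unit time (assumed constant).
    elapsed:       time difference between the first video frames and
                   the frames currently being processed.
    certain_time:  the threshold named "a certain time" in the claim.
    """
    if elapsed <= certain_time:
        # Not longer than the threshold: use the motion vector.
        # Linear extrapolation is an assumption of this sketch.
        x, y = last_position
        vx, vy = motion_vector
        return Placement("point", (x + vx * elapsed, y + vy * elapsed))
    # Longer than the threshold: assign to a diffuse sound field.
    return Placement("diffuse")


# Second video frames: within the threshold, so a location is extrapolated.
near = place_offscreen_event((1.0, 2.0), (0.5, -0.25), elapsed=2.0, certain_time=3.0)
# Third video frames: past the threshold, so the event becomes diffuse.
far = place_offscreen_event((1.0, 2.0), (0.5, -0.25), elapsed=5.0, certain_time=3.0)
```

In this sketch the boundary case, where the time difference equals the threshold, falls into the motion-vector branch, matching the "not longer than" wording of the claim.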