| CPC H04S 7/302 (2013.01) [G06T 7/246 (2017.01); G11B 27/10 (2013.01); H04S 3/008 (2013.01); H04S 7/307 (2013.01); G06T 2207/10016 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); H04S 2400/01 (2013.01); H04S 2400/11 (2013.01); H04S 2400/15 (2013.01); H04S 2420/11 (2013.01)] | 20 Claims |

|
1. A video processing apparatus comprising:
a memory storing at least one instruction; and
at least one processor configured to execute the at least one instruction to:
generate a plurality of feature information for time and frequency by analyzing a video signal comprising a plurality of images, based on a first deep neural network (DNN);
extract a first altitude component corresponding to a movement of an object in a vertical direction in a video and a first planar component corresponding to the movement of the object in a horizontal direction in the video from the video signal, based on a second DNN;
extract a second planar component corresponding to a movement of a sound source in the horizontal direction in audio from a first audio signal, based on a third DNN;
generate a second altitude component corresponding to the movement of the sound source in the vertical direction in the audio based on the first altitude component, the first planar component, and the second planar component;
output a second audio signal comprising the second altitude component, based on the plurality of feature information; and
synchronize the second audio signal with the video signal and output the synchronized second audio signal and video signal.
|
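The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the three DNNs are replaced by trivial hand-written stand-ins, and every name, type, and rule below (`Frame`, `StereoSample`, `brightest_pixel`, the sign-agreement rule in `generate_audio_altitude`, the per-frame weighting in `process`) is a hypothetical placeholder chosen only to make the order of the steps concrete.

```python
from typing import List, Tuple

Frame = List[List[float]]            # hypothetical: one image as rows of pixel intensities
StereoSample = Tuple[float, float]   # hypothetical: (left, right) level for one video frame


def brightest_pixel(frame: Frame) -> Tuple[int, int]:
    """Return (row, col) of the largest intensity; a stand-in for locating the object."""
    best = (0, 0)
    for r, row in enumerate(frame):
        for c, value in enumerate(row):
            if value > frame[best[0]][best[1]]:
                best = (r, c)
    return best


def first_dnn_features(video: List[Frame]) -> List[float]:
    """Placeholder for the first DNN: one feature per frame (here, peak intensity)."""
    return [max(max(row) for row in frame) for frame in video]


def second_dnn_video_motion(video: List[Frame]) -> Tuple[List[float], List[float]]:
    """Placeholder for the second DNN: per-frame (altitude, planar) object movement."""
    altitude, planar = [0.0], [0.0]
    for prev, cur in zip(video, video[1:]):
        (r0, c0), (r1, c1) = brightest_pixel(prev), brightest_pixel(cur)
        altitude.append(float(r0 - r1))  # rows grow downward, so upward movement is positive
        planar.append(float(c1 - c0))
    return altitude, planar


def third_dnn_audio_planar(audio: List[StereoSample]) -> List[float]:
    """Placeholder for the third DNN: per-frame change in left/right balance."""
    pan = [right - left for left, right in audio]
    return [0.0] + [b - a for a, b in zip(pan, pan[1:])]


def generate_audio_altitude(v_alt: List[float], v_planar: List[float],
                            a_planar: List[float]) -> List[float]:
    """Assumed rule: keep the video altitude only where the horizontal
    movements of object and sound source do not disagree in sign."""
    return [alt if vp * ap >= 0 else 0.0
            for alt, vp, ap in zip(v_alt, v_planar, a_planar)]


def process(video: List[Frame], audio: List[StereoSample]):
    """Run the claimed steps in order and return synchronized (frame, audio) pairs."""
    if len(video) != len(audio):
        raise ValueError("sketch assumes one audio sample per video frame")
    features = first_dnn_features(video)
    v_alt, v_planar = second_dnn_video_motion(video)
    a_planar = third_dnn_audio_planar(audio)
    a_alt = generate_audio_altitude(v_alt, v_planar, a_planar)
    # Second audio signal: the original channels plus a height value, weighted
    # by the per-frame feature (an assumption, not taken from the claim).
    second_audio = [(left, right, alt * w)
                    for (left, right), alt, w in zip(audio, a_alt, features)]
    # Synchronization here is simply index-wise pairing of frames and samples.
    return list(zip(video, second_audio))
```

Calling `process(video, audio)` with equal-length inputs returns one `(frame, (left, right, height))` pair per frame. In an actual system each placeholder function would be a trained network, and the claim does not specify how the second altitude component is computed beyond the inputs it depends on.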