CPC G10L 21/0208 (2013.01) [H04N 21/4852 (2013.01); H04N 21/8106 (2013.01)] | 17 Claims |
1. An apparatus comprising:
at least one processor; and
at least one non-transitory memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to perform at least the following:
receive multimedia data representing a scene, the multimedia data comprising at least audio data representing an audio component of the scene;
determine at least one location of unwanted sound in the scene, wherein the at least one determined location comprises one or more spatial locations in the scene where the unwanted sound is present;
perform first audio processing to remove at least part of the unwanted sound from the at least one determined location;
perform second audio processing to add artificial sound associated with the unwanted sound at or proximate the at least one determined location;
identify one or more regions of interest based on object classification; and
determine whether the at least one determined location of unwanted sound corresponds with the one or more regions of interest, wherein a correspondence between the at least one determined location of unwanted sound and the one or more regions of interest affects:
an amount of the unwanted sound removed via the first audio processing, and
an amount of the artificial sound added via the second audio processing.
|
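The final limitation of claim 1, in which correspondence with a region of interest affects both the amount of unwanted sound removed and the amount of artificial sound added, can be illustrated with a minimal sketch. The claim does not specify how the amounts are chosen or in which direction they change, so everything below is an assumption made for illustration only: the rectangular `Region` type, the normalized 2D scene coordinates, the `processing_amounts` and `mix` helper names, and the particular gain values are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical axis-aligned region of interest in normalized scene coordinates."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x: float, y: float) -> bool:
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def processing_amounts(noise_xy, regions, inside=(0.9, 0.1), outside=(0.5, 0.5)):
    """Return (removal_amount, artificial_amount), each in [0, 1].

    Assumed policy: if the unwanted-sound location falls inside any region
    of interest, use the `inside` pair; otherwise use the `outside` pair.
    The specific values are illustrative placeholders.
    """
    x, y = noise_xy
    corresponds = any(region.contains(x, y) for region in regions)
    return inside if corresponds else outside


def mix(original, unwanted_estimate, artificial, removal, addition):
    """Subtract a scaled estimate of the unwanted sound (first processing)
    and add scaled artificial sound (second processing), sample by sample."""
    return [
        o - removal * u + addition * a
        for o, u, a in zip(original, unwanted_estimate, artificial)
    ]


# Example: one region of interest covering the lower-left quarter of the scene.
regions = [Region(0.0, 0.0, 0.5, 0.5)]
removal, addition = processing_amounts((0.25, 0.25), regions)
processed = mix([1.0, 1.0], [0.5, 0.5], [0.5, 0.5], removal, addition)
```

In this sketch the region test stands in for the claimed object-classification step, and the per-sample subtraction stands in for whatever spatial audio processing an actual implementation would use; only the dependence of both amounts on the correspondence check mirrors the claim language.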