| CPC G10L 25/78 (2013.01) [G06T 7/248 (2017.01); G10L 21/0272 (2013.01); H04R 3/005 (2013.01)] | 16 Claims |

|
1. A method performed by a processor of a device having a plurality of microphones, comprising:
receiving a plurality of audio signals from the plurality of microphones, the plurality of microphones capturing a sound field;
processing the audio signals into a plurality of frequency domain signals;
extracting, from the frequency domain signals, a primary speech signal;
extracting, from the frequency domain signals, one or more ambience audio signals;
generating one or more spatial parameters defining spatial characteristics of an ambience sound in the one or more ambience audio signals, the one or more spatial parameters include a location of an ambience sound source;
detecting a location or an orientation of the device, as tracking data;
modifying the one or more spatial parameters based on the tracking data by offsetting a relative movement of the ambience sound source, the relative movement caused by a change in the location or orientation of the device, to maintain a location of the ambience sound source constant during playback; and
encoding the primary speech signal, the one or more ambience audio signals, and the as modified spatial parameters into one or more encoded data streams.
|
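
The modifying step of claim 1 (offsetting the relative movement of the ambience sound source caused by a change in device location or orientation) can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a 2D simplification in which the tracking data is a planar pose (position plus yaw) and the spatial parameter is the source position measured in the device's current frame. The names `Pose2D` and `compensate_source_location` are hypothetical and do not appear in the claim.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Pose2D:
    """Hypothetical tracking sample: device position (metres) and yaw (radians, counterclockwise)."""
    x: float
    y: float
    yaw: float


def compensate_source_location(source_xy, current, reference):
    """Re-express a source position measured in the device's current frame
    in the frame of a fixed reference pose (assumed here to be the pose at
    the start of capture), so device motion does not move the source."""
    sx, sy = source_xy

    # Current device frame -> world frame: rotate by the current yaw, then translate.
    c, s = math.cos(current.yaw), math.sin(current.yaw)
    wx = current.x + c * sx - s * sy
    wy = current.y + s * sx + c * sy

    # World frame -> reference frame: remove the translation, then apply the inverse rotation.
    dx, dy = wx - reference.x, wy - reference.y
    c, s = math.cos(reference.yaw), math.sin(reference.yaw)
    return (c * dx + s * dy, -s * dx + c * dy)


if __name__ == "__main__":
    reference = Pose2D(0.0, 0.0, 0.0)
    # The device has turned 90 degrees counterclockwise, so a source that was
    # 2 m straight ahead is now measured 2 m along the negative y axis.
    rotated = Pose2D(0.0, 0.0, math.pi / 2)
    print(compensate_source_location((0.0, -2.0), rotated, reference))
```

In this sketch the compensated position is what would be passed to the encoder in place of the raw measurement. A full implementation would presumably use 3D rotations and per-frame tracking samples, neither of which is specified in the claim text shown here.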