| CPC G10L 21/013 (2013.01) [H04L 12/1831 (2013.01); G10L 2021/0135 (2013.01)] | 11 Claims |

|
1. A method for voice anonymization in an audio- or videoconferencing session, the method comprising:
receiving a plurality of input audio samples comprising speech;
calculating a frequency spectrum of each of the plurality of input audio samples;
calculating a smoothed spectral magnitude envelope of a first of the plurality of frequency spectrums to determine a plurality of formant features of the speech, each of the plurality of formant features being located at different frequencies in the frequency spectrum;
determining one random scaling factor for the audio- or videoconferencing session;
determining, based on the one random scaling factor, a voice anonymization function shifting the formant location of at least one of the plurality of formants;
applying the voice anonymization function on the frequency spectrum of each of the subsequent plurality of input audio samples in the audio- or videoconferencing session;
determining the one random scaling factor by using a random function to pick a number from two or more ranges of scaling factors;
wherein the voice anonymization function is a linear segment warping function performing linear scaling in the range 0-4 kHz;
the voice anonymization function is tapering off to zero warp at one half of a sampling frequency; and
determining a plurality of frequency gains for the voice anonymization function by calculating a ratio between the smoothed spectral magnitude envelope and a spectral magnitude envelope of the voice anonymization function applied on the smoothed spectral magnitude envelope.
|
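The scaling-factor, warping, and gain steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not taken from the patent and is only one possible reading of the claim language. The function names, the two example ranges (0.80 to 0.90 and 1.10 to 1.20), the 16 kHz sampling rate, the straight-line shape of the taper between 4 kHz and half the sampling frequency, and the direction of the gain ratio (warped envelope divided by original envelope) are all assumptions made for this example.

```python
import random


def pick_scaling_factor(rng, ranges=((0.80, 0.90), (1.10, 1.20))):
    """Pick one factor per session from two or more ranges (example ranges are assumed)."""
    low, high = rng.choice(ranges)
    return rng.uniform(low, high)


def warp(f_hz, alpha, fs_hz, knee_hz=4000.0):
    """Map a source frequency to its warped location.

    Linear scaling by alpha from 0 to knee_hz, then a second linear segment
    that returns to zero warp (identity) at fs_hz / 2. Assumes fs_hz / 2 is
    above both knee_hz and alpha * knee_hz.
    """
    nyquist = fs_hz / 2.0
    if f_hz <= knee_hz:
        return alpha * f_hz
    t = (f_hz - knee_hz) / (nyquist - knee_hz)
    return alpha * knee_hz + t * (nyquist - alpha * knee_hz)


def inverse_warp(g_hz, alpha, fs_hz, knee_hz=4000.0):
    """Return the source frequency that warp() maps onto g_hz."""
    nyquist = fs_hz / 2.0
    if g_hz <= alpha * knee_hz:
        return g_hz / alpha
    t = (g_hz - alpha * knee_hz) / (nyquist - alpha * knee_hz)
    return knee_hz + t * (nyquist - knee_hz)


def warp_envelope(envelope, alpha, fs_hz, knee_hz=4000.0):
    """Resample a magnitude envelope so that a peak at f moves to warp(f).

    envelope[k] is the magnitude at k * (fs_hz / 2) / (len(envelope) - 1) Hz.
    Values between bins are linearly interpolated.
    """
    n = len(envelope)
    bin_hz = (fs_hz / 2.0) / (n - 1)
    warped = []
    for k in range(n):
        src = inverse_warp(k * bin_hz, alpha, fs_hz, knee_hz) / bin_hz
        src = min(max(src, 0.0), float(n - 1))  # guard against rounding drift
        i = min(int(src), n - 2)
        frac = src - i
        warped.append((1.0 - frac) * envelope[i] + frac * envelope[i + 1])
    return warped


def frequency_gains(envelope, warped, eps=1e-12):
    """Per-bin gain as warped / original (ratio direction is an assumption)."""
    return [w / max(e, eps) for e, w in zip(envelope, warped)]


if __name__ == "__main__":
    rng = random.Random()                 # one generator per session
    alpha = pick_scaling_factor(rng)      # drawn once, reused for every frame
    # Toy smoothed envelope: 65 bins up to 8 kHz with one peak near 1 kHz.
    envelope = [1.0 + max(0, 4 - abs(k - 8)) for k in range(65)]
    gains = frequency_gains(envelope, warp_envelope(envelope, alpha, 16000))
    print(alpha, gains[:12])
```

In this sketch the gains would be multiplied bin by bin onto the magnitude spectrum of each later frame, so the same factor shapes the whole session. Drawing from disjoint ranges that exclude values near 1.0 is one way to keep the shift from being negligible, though the claim itself does not state the range values.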