CPC G10L 21/0232 (2013.01) [G10L 25/12 (2013.01); G10L 25/18 (2013.01); G10L 25/21 (2013.01); G10L 25/30 (2013.01); G10L 2021/02082 (2013.01)] | 20 Claims |
1. A speech signal dereverberation processing method, executed by at least one processor, the method comprising:
extracting an amplitude spectrum feature and a phase spectrum feature of a current frame in an original speech signal;
extracting subband amplitude spectrums from the amplitude spectrum feature corresponding to the current frame;
determining, based on the subband amplitude spectrums and a reverberation strength distribution associated with the current frame and by using a first model, a reverberation strength indicator corresponding to the current frame, the first model being a first neural network model that is trained using reverberated band amplitude spectrum, clean speech band amplitude spectrum, and a reverberation-to-clean-speech energy ratio, with the reverberation-to-clean-speech energy ratio used as a training target;
determining, based on the subband amplitude spectrums and the reverberation strength indicator, and by using a second model, a clean speech subband spectrum corresponding to the current frame, wherein the second model is a regressive reverberation strength prediction algorithm model based on a history frame; and
obtaining a dereverberated clean speech signal by performing signal conversion on the clean speech subband spectrum and the phase spectrum feature corresponding to the current frame.
|
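As an illustration only, the signal path recited in claim 1 (amplitude and phase extraction, subband amplitude extraction, and signal conversion back to the time domain) can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the claimed implementation. The function names, the 512-point FFT, the Hann window, the eight equal-width bands, and the `placeholder_gain` step are all hypothetical. In particular, `placeholder_gain` is a simple attenuation that stands in for both the first neural network model and the second history-frame-based model, neither of which is specified in enough detail here to reproduce.

```python
import numpy as np


def analyze_frame(frame, n_fft=512):
    """Return the amplitude and phase spectra of one Hann-windowed frame."""
    window = np.hanning(len(frame))
    spectrum = np.fft.rfft(frame * window, n=n_fft)
    return np.abs(spectrum), np.angle(spectrum)


def subband_amplitudes(amplitude, n_bands=8):
    """Average the amplitude spectrum over equal-width bands (assumed split)."""
    return np.array([band.mean() for band in np.array_split(amplitude, n_bands)])


def placeholder_gain(subbands, strength):
    """Hypothetical stand-in for the two models in the claim.

    Scales every subband by (1 - strength), with strength in [0, 1] playing
    the role of the reverberation strength indicator.
    """
    return subbands * (1.0 - strength)


def synthesize_frame(clean_subbands, subbands, amplitude, phase, n_fft=512):
    """Apply per-band gains to the bins, reattach the phase, and invert the FFT."""
    # Gain per band; the floor avoids division by zero on silent bands.
    gains = clean_subbands / np.maximum(subbands, 1e-12)
    # Expand each band gain to the bins that band covers.
    sizes = [len(band) for band in np.array_split(amplitude, len(subbands))]
    bin_gains = np.repeat(gains, sizes)
    spectrum = amplitude * bin_gains * np.exp(1j * phase)
    return np.fft.irfft(spectrum, n=n_fft)


# Usage on one synthetic frame (random noise, purely for demonstration).
rng = np.random.default_rng(0)
frame = rng.standard_normal(512)
amplitude, phase = analyze_frame(frame)
subbands = subband_amplitudes(amplitude)
clean = placeholder_gain(subbands, strength=0.5)
output = synthesize_frame(clean, subbands, amplitude, phase)
```

With a strength of 0 the sketch returns the windowed input frame unchanged, which is a quick way to check that the analysis and synthesis steps are consistent. A full system would also need overlapping frames and overlap-add reconstruction, which are omitted here.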