| CPC G10L 21/0232 (2013.01) [G10L 15/02 (2013.01); G10L 15/063 (2013.01); G10L 25/18 (2013.01); G10L 2021/02166 (2013.01)] | 20 Claims |

|
9. An apparatus comprising:
at least one processing device configured to:
obtain noisy speech signals;
extract acoustic features from the noisy speech signals;
receive a predicted speech mask from a speech mask prediction model based on a first subset of the acoustic features;
receive a predicted noise mask from a noise mask prediction model based on a second subset of the acoustic features;
provide predicted speech features determined using the predicted speech mask and predicted noise features determined using the predicted noise mask to a filtering mask prediction model; and
generate a clean speech signal using a predicted filtering mask output by the filtering mask prediction model.
|
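The data flow recited in claim 9 can be illustrated with a minimal sketch. This is not the claimed implementation: the three "models" below are hand-written stand-ins for trained networks, and every name (`speech_mask_model`, `noise_mask_model`, `filtering_mask_model`, `enhance`), the choice of magnitude and log-magnitude spectra as the two feature subsets, and the frame parameters are assumptions made only for illustration.

```python
import numpy as np

FRAME_LEN = 256          # assumed frame length (samples)
HOP = FRAME_LEN // 2     # assumed 50% overlap
# Periodic square-root Hann window: used at analysis and synthesis, the
# product is a Hann window, which sums to 1 at 50% overlap.
WINDOW = np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * np.arange(FRAME_LEN) / FRAME_LEN))

def stft(signal):
    """Windowed short-time Fourier transform, shape (n_frames, n_bins)."""
    n_frames = 1 + (len(signal) - FRAME_LEN) // HOP
    frames = np.stack([signal[i * HOP:i * HOP + FRAME_LEN] * WINDOW
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec):
    """Inverse transform by windowed overlap-add."""
    frames = np.fft.irfft(spec, n=FRAME_LEN, axis=1) * WINDOW
    out = np.zeros((len(frames) - 1) * HOP + FRAME_LEN)
    for i, frame in enumerate(frames):
        out[i * HOP:i * HOP + FRAME_LEN] += frame
    return out

def extract_features(noisy):
    """Acoustic features: complex spectrum plus two feature subsets (assumed)."""
    spec = stft(noisy)
    mag = np.abs(spec)
    return spec, {"mag": mag, "log_mag": np.log1p(mag)}

def speech_mask_model(log_mag):
    # Stand-in for a trained speech mask predictor: sigmoid around the median.
    return 1.0 / (1.0 + np.exp(-(log_mag - np.median(log_mag))))

def noise_mask_model(mag):
    # Stand-in for a trained noise mask predictor: favors low-energy bins.
    return 1.0 - mag / (mag + np.median(mag) + 1e-8)

def filtering_mask_model(speech_feats, noise_feats):
    # Stand-in for a trained filtering mask predictor: Wiener-like ratio.
    s2, n2 = speech_feats ** 2, noise_feats ** 2
    return s2 / (s2 + n2 + 1e-8)

def enhance(noisy):
    """Run the claim-9 data flow on a 1-D noisy signal."""
    spec, feats = extract_features(noisy)
    speech_mask = speech_mask_model(feats["log_mag"])   # first feature subset
    noise_mask = noise_mask_model(feats["mag"])         # second feature subset
    speech_feats = speech_mask * feats["mag"]           # predicted speech features
    noise_feats = noise_mask * feats["mag"]             # predicted noise features
    filt_mask = filtering_mask_model(speech_feats, noise_feats)
    clean = istft(filt_mask * spec)                     # clean speech signal
    return clean, filt_mask
```

Calling `enhance(noisy)` on a one-dimensional array of at least `FRAME_LEN` samples returns the reconstructed signal and the filtering mask. In a real system each stand-in function would presumably be replaced by a trained model, and the two feature subsets could differ from the ones assumed here.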