CPC G10L 15/16 (2013.01) [G06F 17/15 (2013.01); G06N 3/045 (2023.01)] | 18 Claims |
1. A speech processing method, comprising:
inputting a time domain sample of N*1 dimension for an input speech into a densely connected hybrid network, N being an integer greater than one;
passing the time domain sample through a plurality of dense blocks in the densely connected hybrid network;
reshaping the time domain samples into M subframes by passing the time domain samples through the plurality of dense blocks;
inputting the M subframes into gated recurrent unit (GRU) components of N/M-dimension;
outputting clean speech, in which noise is removed from the input speech, by passing the M subframes through the GRU components,
wherein the densely connected hybrid network is a combination of a convolutional neural network (CNN) and a recurrent neural network (RNN).
|
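The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the function and class names (`dense_block`, `reshape_to_subframes`, `ToyGRU`, `denoise_sketch`), the kernel size, the layer counts, and the random untrained weights are all assumptions made for illustration, and the reshaping is done as a separate step after the dense blocks for readability. With untrained weights the output is not actually denoised; only the shapes (N samples in, M subframes of N/M dimensions through the GRU, N samples out) follow the claim.

```python
import math
import random


def conv1d_same(channels, kernels):
    """Sum of per-channel 1-D convolutions, zero-padded to keep length N."""
    n, k = len(channels[0]), len(kernels[0])
    pad = k // 2
    out = [0.0] * n
    for ch, ker in zip(channels, kernels):
        for t in range(n):
            for j in range(k):
                idx = t + j - pad
                if 0 <= idx < n:
                    out[t] += ker[j] * ch[idx]
    return out


def dense_block(x, num_layers, rng, kernel_size=3):
    """Dense connectivity: each layer sees the input plus all earlier outputs."""
    features = [x]
    for _ in range(num_layers):
        # Hypothetical random kernels, one per feature map seen so far.
        kernels = [[rng.uniform(-0.5, 0.5) for _ in range(kernel_size)]
                   for _ in features]
        y = conv1d_same(features, kernels)
        features.append([max(0.0, v) for v in y])  # ReLU
    return features[-1]


def reshape_to_subframes(samples, m):
    """Split N samples into M consecutive subframes of N/M samples each."""
    n = len(samples)
    if n % m != 0:
        raise ValueError("N must be divisible by M")
    width = n // m
    return [samples[i * width:(i + 1) * width] for i in range(m)]


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def _matvec(w, v):
    return [sum(a * b for a, b in zip(row, v)) for row in w]


class ToyGRU:
    """Single GRU layer whose input and hidden size both equal N/M."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)

        def mat():
            return [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                    for _ in range(dim)]

        self.dim = dim
        self.wz, self.uz = mat(), mat()  # update gate
        self.wr, self.ur = mat(), mat()  # reset gate
        self.wh, self.uh = mat(), mat()  # candidate state

    def step(self, x, h):
        z = [_sigmoid(a + b)
             for a, b in zip(_matvec(self.wz, x), _matvec(self.uz, h))]
        r = [_sigmoid(a + b)
             for a, b in zip(_matvec(self.wr, x), _matvec(self.ur, h))]
        rh = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b)
                for a, b in zip(_matvec(self.wh, x), _matvec(self.uh, rh))]
        # New state blends the old state and the candidate via the update gate.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

    def run(self, subframes):
        h = [0.0] * self.dim
        outputs = []
        for x in subframes:  # the M subframes are the M time steps
            h = self.step(x, h)
            outputs.append(h)
        return outputs


def denoise_sketch(samples, m, num_blocks=2, seed=0):
    """N samples -> dense blocks -> M subframes -> GRU -> N output samples."""
    rng = random.Random(seed)
    x = list(samples)
    for _ in range(num_blocks):
        x = dense_block(x, num_layers=2, rng=rng)
    subframes = reshape_to_subframes(x, m)
    gru = ToyGRU(len(x) // m, seed=seed)
    return [v for frame in gru.run(subframes) for v in frame]
```

For instance, calling `denoise_sketch(samples, m=4)` on a list of 16 samples passes 4 subframes of 4 values each through the GRU and returns 16 values. A trained system would learn the convolution and GRU weights from pairs of noisy and clean speech rather than drawing them at random.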