US 11,837,220 B2
Apparatus and method for speech processing using a densely connected hybrid neural network
Minje Kim, Indianapolis, IN (US); Mi Suk Lee, Daejeon (KR); Seung Kwon Beack, Daejeon (KR); Jongmo Sung, Daejeon (KR); Tae Jin Lee, Daejeon (KR); Jin Soo Choi, Daejeon (KR); and Kai Zhen, Indianapolis, IN (US)
Assigned to Electronics and Telecommunications Research Institute, Daejeon (KR); and The Trustees of Indiana University, Indianapolis, IN (US)
Filed by Electronics and Telecommunications Research Institute, Daejeon (KR); and The Trustees of Indiana University, Indianapolis, IN (US)
Filed on May 5, 2021, as Appl. No. 17/308,800.
Claims priority of application No. 10-2020-0054733 (KR), filed on May 7, 2020.
Prior Publication US 2021/0350796 A1, Nov. 11, 2021
Int. Cl. G10L 15/16 (2006.01); G06F 17/15 (2006.01); G06N 3/045 (2023.01)
CPC G10L 15/16 (2013.01) [G06F 17/15 (2013.01); G06N 3/045 (2023.01)]
18 Claims
 
1. A speech processing method, comprising:
inputting a time domain sample of N*1 dimension for an input speech into a densely connected hybrid network, N being an integer greater than one;
passing the time domain sample through a plurality of dense blocks in the densely connected hybrid network;
reshaping the time domain sample into M subframes by passing the time domain sample through the plurality of dense blocks;
inputting the M subframes into gated recurrent unit (GRU) components of N/M-dimension;
outputting clean speech, in which noise has been removed from the input speech, by passing the M subframes through the GRU components,
wherein the densely connected hybrid network is a combination of a convolutional neural network (CNN) and a recurrent neural network (RNN).
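
The data flow recited in claim 1 can be illustrated with a minimal pure-Python sketch. It is not the patented implementation. Every function name, the layer counts, the smoothing kernel, the tanh activation, and the randomly initialised (untrained) GRU weights are assumptions chosen only to show the shapes: an N*1 frame passes through dense blocks, is reshaped into M subframes, and each subframe of dimension N/M is fed to a GRU step. Concatenating the M hidden states into the output frame is likewise an assumption.

```python
import math
import random

def conv1d_same(channels, kernel):
    """Zero-padded 1-D convolution summed over all input channels."""
    n = len(channels[0])
    half = len(kernel) // 2
    out = [0.0] * n
    for ch in channels:
        for t in range(n):
            for k, w in enumerate(kernel):
                j = t + k - half
                if 0 <= j < n:
                    out[t] += w * ch[j]
    return out

def dense_block(x, num_layers=2, kernel=(0.25, 0.5, 0.25)):
    """Dense connectivity: each layer sees every earlier feature map."""
    features = [x]
    for _ in range(num_layers):
        y = conv1d_same(features, kernel)
        features.append([math.tanh(v) for v in y])  # assumed activation
    return features[-1]  # still N samples long

def to_subframes(x, m):
    """Reshape an N-sample frame into M subframes of N/M samples each."""
    if len(x) % m != 0:
        raise ValueError("N must be divisible by M")
    d = len(x) // m
    return [x[i * d:(i + 1) * d] for i in range(m)]

def matvec(w, v):
    return [sum(a * b for a, b in zip(row, v)) for row in w]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gru_step(x, h, p):
    """One standard GRU update with input and hidden size N/M."""
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], x), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], x), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    c = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], x), matvec(p["Uh"], rh))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, c)]

def random_gru_params(d, rng):
    """Hypothetical untrained weights; a real system would learn these."""
    names = ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")
    return {k: [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(d)]
            for k in names}

def enhance(frame, m, params, num_blocks=2):
    x = frame
    for _ in range(num_blocks):          # plurality of dense blocks (CNN part)
        x = dense_block(x)
    subframes = to_subframes(x, m)       # M subframes of N/M samples
    h = [0.0] * len(subframes[0])
    out = []
    for sf in subframes:                 # GRU components (RNN part)
        h = gru_step(sf, h, params)
        out.extend(h)                    # assumed: concatenate hidden states
    return out

N, M = 16, 4
rng = random.Random(0)
frame = [rng.uniform(-1, 1) for _ in range(N)]
params = random_gru_params(N // M, rng)
clean = enhance(frame, M, params)
```

With untrained weights the output is not actually denoised; the sketch only shows that an N-sample input yields an N-sample output after the CNN and RNN stages.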