US 12,424,232 B2
Reduced latency streaming dynamic noise suppression using convolutional neural networks
Adam Kupryjanow, Gdansk PM (PL); and Lukasz Pindor, Gdansk Pomorskie (PL)
Assigned to Intel Corporation, Santa Clara, CA (US)
Filed by Intel Corporation, Santa Clara, CA (US)
Filed on Nov. 26, 2021, as Appl. No. 17/535,759.
Claims priority of provisional application 63/252,687, filed on Oct. 6, 2021.
Prior Publication US 2022/0084535 A1, Mar. 17, 2022
Int. Cl. G10L 21/0208 (2013.01); G06N 3/08 (2023.01); G10L 21/0232 (2013.01); G10L 25/30 (2013.01); G10L 25/78 (2013.01); H04R 3/04 (2006.01)
CPC G10L 21/0208 (2013.01) [G06N 3/08 (2013.01); G10L 21/0232 (2013.01); G10L 25/30 (2013.01); G10L 25/78 (2013.01); H04R 3/04 (2013.01)]
20 Claims
 
1. A dynamic noise suppression system, the system comprising:
an encoder circuit to generate a magnitude spectrum and a phase spectrum of an input audio signal, the input audio signal comprising speech and dynamic noise;
a separator circuit comprising a temporal convolution network (TCN) to generate a separation mask based on the magnitude spectrum, wherein the TCN comprises one or more depth-wise (DW) convolution layers, each DW convolution layer including a state buffer to store a number of previous states of the associated DW convolution layer, the number of stored previous states based on a dilation factor of the associated DW convolution layer, wherein each DW convolution layer is configured to use one or more of the previous states stored in the state buffer of the associated DW convolutional layer to generate an output;
a mixer to multiply the separation mask with the magnitude spectrum to separate the speech from the dynamic noise to obtain a denoised magnitude spectrum; and
a decoder circuit to reconstruct the input audio signal with reduced dynamic noise, based on the denoised magnitude spectrum and the phase spectrum.
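
The state-buffer idea in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation. The class and function names (StreamingDepthwiseConv, offline_depthwise_conv, apply_mask) are hypothetical, the weights are untrained random values, and the buffer size of (kernel_size - 1) * dilation frames is an assumption; the claim says only that the number of stored states is based on the dilation factor. The sketch shows a causal depth-wise layer that processes one frame at a time from its own buffer, an offline reference to compare it against, and the mask multiplication followed by recombination with the phase spectrum.

```python
import numpy as np


class StreamingDepthwiseConv:
    """Causal depth-wise 1-D convolution that processes one frame per call."""

    def __init__(self, weights, dilation):
        # weights: (channels, kernel_size), one independent filter per channel.
        self.weights = np.asarray(weights, dtype=float)
        self.dilation = int(dilation)
        channels, kernel_size = self.weights.shape
        # Assumption: keep (kernel_size - 1) * dilation previous frames,
        # which is the receptive field minus the current frame.
        self.state = np.zeros(((kernel_size - 1) * self.dilation, channels))

    def step(self, frame):
        # frame: (channels,) features for the newest time step.
        frame = np.asarray(frame, dtype=float)
        history = np.vstack([self.state, frame[None, :]])
        # Pick every dilation-th frame: kernel_size taps, oldest first.
        taps = history[::self.dilation]
        out = np.sum(taps * self.weights.T, axis=0)
        # Drop the oldest frame so the buffer length stays constant.
        self.state = history[1:]
        return out


def offline_depthwise_conv(x, weights, dilation):
    """Reference: the same causal dilated convolution over a whole sequence."""
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    channels, kernel_size = weights.shape
    pad = (kernel_size - 1) * dilation
    padded = np.vstack([np.zeros((pad, channels)), x])
    out = np.zeros_like(x)
    for k in range(kernel_size):
        start = k * dilation
        out += padded[start:start + len(x)] * weights[:, k]
    return out


def apply_mask(magnitude, phase, mask):
    """Multiply the mask with the magnitude, then reattach the original phase."""
    denoised_magnitude = mask * magnitude
    return denoised_magnitude * np.exp(1j * phase)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    weights = rng.standard_normal((4, 3))   # 4 channels, kernel size 3
    frames = rng.standard_normal((16, 4))   # 16 time steps
    layer = StreamingDepthwiseConv(weights, dilation=2)
    streamed = np.stack([layer.step(f) for f in frames])
    reference = offline_depthwise_conv(frames, weights, dilation=2)
    print("streaming matches offline:", np.allclose(streamed, reference))
```

Because each layer keeps only its own short history, a new frame can be processed without re-running the convolution over the whole past input. A real decoder would additionally apply an inverse transform to the complex spectrum returned by apply_mask to obtain the time-domain signal.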