| CPC G10L 21/0232 (2013.01) [G10L 21/0224 (2013.01); G10L 21/0272 (2013.01); G10L 21/0308 (2013.01); G10L 25/09 (2013.01); G10L 25/18 (2013.01); G10L 25/21 (2013.01); G10L 25/84 (2013.01)] | 16 Claims |

|
1. A non-transitory computer readable medium comprising program instructions executable by at least one processor to cause the at least one processor to perform a method comprising:
obtaining a first audio sample;
determining that a first portion of the first audio sample contains frequency content at frequencies higher than 5.6 kilohertz that exceeds a threshold energy level;
responsive to determining that the first portion contains frequency content at frequencies higher than 5.6 kilohertz that exceeds the threshold energy level, determining a first audio filter based on the first portion of the first audio sample by:
determining a first spectrogram for the first portion; and
performing non-negative matrix factorization to generate a first matrix and a second matrix whose product corresponds to a low-frequency portion of the first spectrogram that is below a threshold frequency, wherein the first matrix is composed of a set of column vectors that span along a frequency dimension of the first spectrogram, and wherein the second matrix is composed of a set of row vectors that span along a time dimension of the first spectrogram;
subsequent to obtaining the first audio sample, obtaining a second audio sample; and
applying the first audio filter to the second audio sample to generate a first audio output by:
determining a second spectrogram for the second audio sample;
applying the first matrix to a low-frequency portion of the second spectrogram that is below the threshold frequency to generate a third spectrogram that represents noise content of the second audio sample; and
using the third spectrogram to remove the noise content from the second audio sample, thereby generating the first audio output.
|
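The method recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. The sample rate, frame size, cutoff ("threshold frequency"), energy threshold, NMF rank, and every function name below are assumptions chosen for illustration. The NMF uses standard multiplicative updates, and the final step uses plain spectral subtraction, which the claim does not specify. The sketch stops at the denoised magnitude spectrogram and omits inversion back to a waveform.

```python
import numpy as np

# All constants except HIGH_BAND_HZ are assumed values, not taken from the claim.
SAMPLE_RATE = 16_000       # Hz
FRAME, HOP = 512, 256      # STFT frame length and hop, in samples
HIGH_BAND_HZ = 5_600       # 5.6 kHz boundary recited in the claim
CUTOFF_HZ = 2_000          # stands in for the claim's "threshold frequency"
ENERGY_THRESHOLD = 1e-3    # stands in for the claim's "threshold energy level"
RANK = 4                   # number of NMF components
EPS = 1e-9                 # guards against division by zero

FREQS = np.fft.rfftfreq(FRAME, d=1.0 / SAMPLE_RATE)


def spectrogram(x):
    """Magnitude STFT with shape (frequency bins, time frames)."""
    n_frames = 1 + (len(x) - FRAME) // HOP
    frames = np.stack([x[i * HOP:i * HOP + FRAME] for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames * np.hanning(FRAME), axis=1)).T


def low_bins():
    """Boolean mask of the frequency bins below the cutoff."""
    return FREQS < CUTOFF_HZ


def has_high_band_energy(x):
    """True if the mean power above 5.6 kHz exceeds the energy threshold."""
    S = spectrogram(x)
    return np.mean(S[FREQS > HIGH_BAND_HZ] ** 2) > ENERGY_THRESHOLD


def nmf(V, rank=RANK, W=None, iters=200, seed=0):
    """Multiplicative-update NMF, V ~ W @ H. A supplied W is held fixed."""
    rng = np.random.default_rng(seed)
    fixed = W is not None
    if not fixed:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if not fixed:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def learn_filter(first_portion):
    """Factor the low-frequency part of the first spectrogram; keep W.

    Columns of W span the frequency dimension (the "first matrix");
    the discarded H holds the time activations (the "second matrix").
    """
    W, _ = nmf(spectrogram(first_portion)[low_bins()])
    return W


def apply_filter(W, second_sample):
    """Estimate noise in the second sample with W, then subtract it."""
    S = spectrogram(second_sample)
    low = low_bins()
    _, H = nmf(S[low], W=W)                      # fit activations only
    noise = W @ H                                # the "third spectrogram"
    out = S.copy()
    out[low] = np.maximum(S[low] - noise, 0.0)   # spectral subtraction
    return out, noise
```

In use, `learn_filter` would be called only on a portion for which `has_high_band_energy` returns true, and the resulting `W` would then be passed to `apply_filter` for each later sample. Holding `W` fixed while refitting only the activations is one plausible reading of "applying the first matrix"; other readings are possible.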