CPC G10L 21/0364 (2013.01) [G06N 3/04 (2013.01); G06N 3/084 (2013.01); G06N 3/086 (2013.01); G10L 21/04 (2013.01); G10L 25/30 (2013.01); H04L 1/0045 (2013.01); H04L 1/08 (2013.01)]
20 Claims
1. A method comprising:
receiving digital audio data generated at a first client device, the digital audio data representing at least a target audio signal and a noise signal;
segmenting the digital audio data into overlapping windows for enhancement processing using a machine learning (ML) model, wherein a predetermined amount of a window overlaps with adjacent windows that occur before and after the window;
analyzing the window of the digital audio data using the ML model, wherein the ML model is trained to reduce a representation of the noise signal in the window of the digital audio data;
combining the overlapping window segments by averaging results of the analyzing of each window to reduce audio artifacts;
obtaining, based at least in part on the analyzing and combining, enhanced digital audio data that represents at least the target audio signal; and
sending the enhanced digital audio data to a second client device.
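
The segmenting, analyzing, and combining steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the window length `win`, the step `hop`, the zero-padding of the tail, and the function names `segment`, `denoise_window`, and `enhance` are all hypothetical choices made for illustration, and `denoise_window` is an identity placeholder standing in for a trained ML model. Receiving the audio from a first client device and sending it to a second one are outside the sketch.

```python
import numpy as np


def segment(audio, win, hop):
    """Split 1-D audio into overlapping windows of length `win`, stepping by `hop`.

    Assumption: the tail is zero-padded so the last window is full length.
    Returns a list of (start_index, window) pairs.
    """
    if hop <= 0 or hop > win:
        raise ValueError("hop must satisfy 0 < hop <= win")
    if len(audio) <= win:
        pad = win - len(audio)
    else:
        pad = (-(len(audio) - win)) % hop
    padded = np.pad(audio, (0, pad))
    starts = range(0, len(padded) - win + 1, hop)
    return [(s, padded[s:s + win]) for s in starts]


def denoise_window(window):
    """Hypothetical placeholder for the ML model: returns the window unchanged."""
    return window.copy()


def enhance(audio, win=8, hop=4, model=denoise_window):
    """Run `model` on each overlapping window and average the overlapping outputs."""
    audio = np.asarray(audio, dtype=float)
    segments = segment(audio, win, hop)
    total_len = segments[-1][0] + win
    acc = np.zeros(total_len)    # sum of model outputs per sample
    count = np.zeros(total_len)  # number of windows covering each sample
    for start, window in segments:
        acc[start:start + win] += model(window)
        count[start:start + win] += 1
    # Every sample is covered by at least one window because hop <= win.
    return (acc / count)[:len(audio)]
```

With `hop` equal to half of `win`, each interior sample is covered by two windows, so its output is the mean of two model estimates. Averaging in this way smooths discontinuities at window boundaries, which is the artifact reduction the combining step describes.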