| CPC G10L 21/0264 (2013.01) [G10L 21/04 (2013.01); G10L 25/18 (2013.01); G10L 19/00 (2013.01); G10L 25/90 (2013.01)] | 17 Claims |

|
1. A method of audio processing comprising:
generating a frame by sampling audio input in increments, which are based on a first buffer size associated with an input/output buffer of a host device, until a threshold buffer size, corresponding to a frame size used to train a machine learning model, is reached, wherein the first buffer size does not match the threshold buffer size;
extracting, from the frame, amplitude information, pitch information, and pitch status information;
determining, by the machine learning model, control information for audio reproduction based on the amplitude information, the pitch information, and the pitch status information, the control information including pitch control information and noise magnitude control information;
generating filtered noise information by inverting the noise magnitude control information using an overlap and add technique, including:
receiving the noise magnitude control information according to the frame size from the machine learning model;
rendering the filtered noise information in a block size not equal to the frame size;
writing, via the overlap and add technique, the filtered noise information to a circular buffer; and
reading, in the first buffer size, the filtered noise information from the circular buffer;
generating, based on the pitch control information, additive harmonic information by combining a plurality of scaled wavetables; and
rendering audio output based on the filtered noise information and the additive harmonic information.
|
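The buffering steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. The class names `FrameAccumulator` and `OverlapAddRing`, the `hop` parameter, and all sizes are hypothetical, chosen only for illustration. The first class collects host-sized increments until a model-sized frame is available; the second overlap-adds rendered blocks into a circular buffer and reads them back in host-sized chunks. Underrun and overrun checks are omitted.

```python
class FrameAccumulator:
    """Collects host-buffer-sized increments into model-sized frames (hypothetical helper)."""

    def __init__(self, frame_size):
        self.frame_size = frame_size  # frame size the model was trained on
        self.pending = []             # samples not yet emitted as a frame

    def push(self, block):
        """Append one host block; return every complete frame now available."""
        self.pending.extend(block)
        frames = []
        while len(self.pending) >= self.frame_size:
            frames.append(self.pending[:self.frame_size])
            del self.pending[:self.frame_size]
        return frames


class OverlapAddRing:
    """Circular buffer written by overlap-add and read in host-sized chunks (hypothetical helper)."""

    def __init__(self, capacity):
        self.buf = [0.0] * capacity
        self.write_pos = 0
        self.read_pos = 0

    def overlap_add(self, block, hop):
        """Sum a rendered block into the buffer, then advance the write position by hop."""
        n = len(self.buf)
        for i, sample in enumerate(block):
            self.buf[(self.write_pos + i) % n] += sample
        self.write_pos = (self.write_pos + hop) % n

    def read(self, count):
        """Return count samples and clear them so later writes start from zero."""
        n = len(self.buf)
        out = []
        for i in range(count):
            j = (self.read_pos + i) % n
            out.append(self.buf[j])
            self.buf[j] = 0.0
        self.read_pos = (self.read_pos + count) % n
        return out


# Usage with assumed sizes: host blocks of 4 samples, model frames of 6 samples.
acc = FrameAccumulator(frame_size=6)
ring = OverlapAddRing(capacity=8)
for _ in range(3):
    for frame in acc.push([1.0] * 4):
        # A real system would run feature extraction and the model here;
        # this sketch just writes a stand-in rendered block of 4 samples.
        ring.overlap_add([1.0] * 4, hop=2)
host_chunk = ring.read(4)
```

Keeping the accumulator separate from the ring buffer mirrors the claim's separation between input framing and output rendering: the frame size, the rendered block size, and the host buffer size can all differ without any one stage needing to know the others.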