CPC G10L 15/063 (2013.01) [G10L 15/02 (2013.01); G10L 25/84 (2013.01)] | 17 Claims
1. A method comprising:
training a model using audio recordings from noise scenarios in a set of training data;
decomposing a training signal from the set of training data into a message component and a noise component;
scaling the noise component by a random scale factor to obtain a scaled noise, wherein the random scale factor is a power with a base that is a constant and an exponent that includes a random variable;
adding the scaled noise to the message component to obtain a perturbed audio signal that is included in the set of training data;
training a first teacher model using a first subset of the set of training data associated with a first noise scenario of the noise scenarios;
training a second teacher model using a second subset of the set of training data associated with a second noise scenario of the noise scenarios; and
training a student model using soft labels output from the first teacher model and soft labels output from the second teacher model.
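
Illustrative note (not part of the claim): the scaling and adding steps recited above can be sketched as follows. The claim does not specify how the training signal is decomposed, what constant serves as the base, or how the random variable is distributed. This sketch therefore assumes the message and noise components are already available as arrays, uses a base of 10, and draws a gain in decibels uniformly from a hypothetical range of plus or minus 6 dB. The name `perturb_noise` and its parameters are hypothetical.

```python
import numpy as np

def perturb_noise(message, noise, rng, base=10.0, max_db=6.0):
    """Return (message + scaled noise, scale factor). All defaults are assumptions."""
    # Random variable in the exponent: a gain in dB, drawn uniformly (assumed distribution).
    gain_db = rng.uniform(-max_db, max_db)
    # The scale factor is a power with a constant base and a random exponent.
    scale = base ** (gain_db / 20.0)
    # Add the scaled noise back to the message component.
    return message + scale * noise, scale

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 16000, endpoint=False)
message = np.sin(2.0 * np.pi * 440.0 * t)    # stand-in for the message component
noise = 0.1 * rng.standard_normal(t.shape)   # stand-in for the noise component

perturbed, scale = perturb_noise(message, noise, rng)
```

Under these assumptions, an exponent of `gain_db / 20` makes the factor an amplitude gain, so the perturbed signal keeps the message unchanged while the noise level varies between roughly half and double its original amplitude.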
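
Illustrative note (not part of the claim): the final training step, in which a student model learns from soft labels output by two teacher models, can be sketched with plain NumPy. The claim does not state the model architecture, how the two teachers' soft labels are combined, or the loss function. This sketch assumes linear softmax classifiers, random weights as stand-ins for already-trained teachers, each teacher labeling only the data from its own noise scenario, and a cross-entropy loss against the soft labels. All names and sizes are hypothetical.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def soft_cross_entropy(probs, soft_targets):
    # Mean cross-entropy between predicted and soft target distributions.
    return float(-(soft_targets * np.log(probs + 1e-12)).sum(axis=1).mean())

rng = np.random.default_rng(0)
n_features, n_classes = 8, 3

# Hypothetical stand-ins for two already-trained teachers, one per noise scenario.
W_teacher1 = rng.standard_normal((n_features, n_classes))
W_teacher2 = rng.standard_normal((n_features, n_classes))

# Hypothetical feature subsets for the first and second noise scenarios.
X1 = rng.standard_normal((200, n_features))
X2 = rng.standard_normal((200, n_features))

# Soft labels: each teacher's output distribution on its own subset (assumed pairing).
X = np.vstack([X1, X2])
soft = np.vstack([softmax(X1 @ W_teacher1), softmax(X2 @ W_teacher2)])

# Student: a single linear softmax layer trained by gradient descent.
W_student = np.zeros((n_features, n_classes))
loss_before = soft_cross_entropy(softmax(X @ W_student), soft)
for _ in range(300):
    probs = softmax(X @ W_student)
    grad = X.T @ (probs - soft) / len(X)   # gradient of the soft-label cross-entropy
    W_student -= 0.5 * grad
loss_after = soft_cross_entropy(softmax(X @ W_student), soft)
```

Soft labels are full probability distributions rather than one-hot targets, so the student receives the teachers' relative confidence across classes. A single student trained on both subsets ends up fitting a compromise between the two scenario-specific teachers.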