| CPC G10L 17/04 (2013.01) [G06N 3/04 (2013.01); G10L 17/18 (2013.01)] | 20 Claims |

|
1. A computer-implemented method comprising:
obtaining, by a computer, a plurality of training audio signals including one or more lower-bandwidth audio signals having a first bandwidth and one or more corresponding higher-bandwidth audio signals having a second bandwidth, wherein the first bandwidth is comparatively lower than the second bandwidth;
training, by the computer, a bandwidth expander comprising a first set of one or more neural network layers of a neural network and an embedding extractor comprising a second set of one or more neural network layers of the neural network using a plurality of joint labels for joint training by:
applying the neural network on the plurality of training audio signals and the plurality of labels associated with the plurality of training audio signals, wherein the plurality of labels indicate features of the plurality of training audio signals, and wherein the computer determines that the bandwidth expander is trained in response to determining that a level of error between a predicted feature vector extracted by the bandwidth expander and an expected feature vector indicated by one or more labels of a training audio signal satisfies a training threshold;
receiving, by the computer, an inbound audio signal having the first bandwidth; and
generating, by the computer, an estimated inbound audio signal having the second bandwidth by applying the bandwidth expander of the neural network on the inbound audio signal.
|
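The training-threshold test and the inference step recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes the bandwidth expander is a single linear layer acting on small feature vectors, that the "level of error" is a mean squared error, and that the threshold is satisfied when the worst per-signal error in an epoch falls at or below it. The joint training with the embedding extractor is omitted. All names (`TRUE_MAP`, `apply_expander`, `train_expander`) and the toy data are hypothetical.

```python
import random

# Hypothetical ground-truth mapping, used only to fabricate toy training pairs.
TRUE_MAP = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.25, -0.25]]


def apply_expander(weights, low_band):
    """Apply one linear layer: lower-bandwidth features -> higher-bandwidth features."""
    return [sum(w * x for w, x in zip(row, low_band)) for row in weights]


def mean_squared_error(predicted, expected):
    """Level of error between a predicted and an expected feature vector."""
    return sum((p - e) ** 2 for p, e in zip(predicted, expected)) / len(expected)


def train_expander(pairs, in_dim, out_dim, threshold=1e-4, lr=0.2,
                   max_epochs=5000, seed=0):
    """Run per-sample gradient descent until the error satisfies the threshold.

    Returns (weights, trained), where trained is True only if the worst
    per-signal error observed during an epoch was at or below the threshold.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    for _ in range(max_epochs):
        worst_error = 0.0
        for low_band, expected in pairs:
            predicted = apply_expander(weights, low_band)
            worst_error = max(worst_error,
                              mean_squared_error(predicted, expected))
            # Gradient of the mean squared error with respect to each weight.
            for i in range(out_dim):
                scale = 2.0 * (predicted[i] - expected[i]) / out_dim
                for j in range(in_dim):
                    weights[i][j] -= lr * scale * low_band[j]
        if worst_error <= threshold:
            return weights, True
    return weights, False


# Toy lower-bandwidth inputs paired with their higher-bandwidth labels.
low_band_signals = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
training_pairs = [(x, apply_expander(TRUE_MAP, x)) for x in low_band_signals]

weights, trained = train_expander(training_pairs, in_dim=2, out_dim=4)

# Inference: estimate higher-bandwidth features for an inbound signal.
inbound = [0.2, 0.4]
estimated = apply_expander(weights, inbound)
```

In this sketch, `trained` plays the role of the computer's determination that the bandwidth expander is trained, and `estimated` stands in for the estimated inbound audio signal having the second bandwidth. A real system would operate on audio features with multi-layer networks rather than a two-by-four weight matrix.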