CPC G10L 15/26 (2013.01) [G10L 13/04 (2013.01); G10L 15/063 (2013.01)] | 20 Claims |
1. A computer-implemented method, executed on a computing device, comprising:
generating an obscured speech signal from an input speech signal and an obscured transcription from a transcription of the input speech signal, wherein the obscured speech signal and the obscured transcription include obscured representations of sensitive content from the input speech signal and the transcription of the input speech signal;
extracting a speaker embedding from the input speech signal;
generating a speaker embedding delta based upon, at least in part, the extracted speaker embedding and a synthetic speaker embedding;
generating a synthetic speech signal from the obscured speech signal using the synthetic speaker embedding;
generating a residual signal based upon, at least in part, the obscured speech signal and the speaker embedding delta; and
training a speech processing system using the obscured transcription, the synthetic speech signal, the speaker embedding delta, and the residual signal.
|
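As an illustrative aside, two of the steps recited in claim 1 can be sketched in a few lines of Python. The claim says only that the speaker embedding delta is generated "based upon, at least in part" the extracted and synthetic speaker embeddings and does not fix a formula, so the element-wise difference used here is an assumption. The helper names (`obscure_transcription`, `speaker_embedding_delta`, `apply_delta`), the `<MASK>` token, and the token-level masking rule are likewise hypothetical and are not taken from the patent.

```python
from typing import List, Sequence, Set


def obscure_transcription(tokens: Sequence[str],
                          sensitive: Set[str],
                          mask: str = "<MASK>") -> List[str]:
    """Replace sensitive tokens with a mask token.

    Assumption: sensitive content is given as a set of lowercase words;
    the claim does not say how sensitive content is detected.
    """
    return [mask if tok.lower() in sensitive else tok for tok in tokens]


def speaker_embedding_delta(extracted: Sequence[float],
                            synthetic: Sequence[float]) -> List[float]:
    """Element-wise difference between the extracted and synthetic embeddings.

    Assumption: a plain difference is one possible delta, not the claimed one.
    """
    if len(extracted) != len(synthetic):
        raise ValueError("embeddings must have the same dimensionality")
    return [e - s for e, s in zip(extracted, synthetic)]


def apply_delta(synthetic: Sequence[float],
                delta: Sequence[float]) -> List[float]:
    """Add the delta back onto the synthetic embedding (inverse of the difference)."""
    if len(synthetic) != len(delta):
        raise ValueError("embedding and delta must have the same dimensionality")
    return [s + d for s, d in zip(synthetic, delta)]


# Toy values chosen to be exactly representable in binary floating point.
extracted_embedding = [0.5, -1.0, 2.0]
synthetic_embedding = [0.25, 0.5, 1.0]

delta = speaker_embedding_delta(extracted_embedding, synthetic_embedding)
restored = apply_delta(synthetic_embedding, delta)
obscured = obscure_transcription(["my", "name", "is", "Alice"], {"alice"})
```

Under this assumed difference form, adding the delta back onto the synthetic embedding returns the extracted embedding, which is one way to see why the delta might be carried alongside the synthetic speech signal as a training input. The residual signal and the speech synthesis step are left out of the sketch because the claim does not give enough detail to model them without guessing.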