| CPC G10L 13/02 (2013.01) [G06F 40/58 (2020.01); G10L 15/26 (2013.01)] | 20 Claims |

|
1. A computer-implemented method, comprising:
receiving a source speech waveform, the source speech waveform including one or more words spoken by a source speaker;
generating source speaker characteristics associated with the source speaker based at least in part on the source speech waveform, wherein the source speaker characteristics comprise first speaker conditioning, first noise conditioning, and first style conditioning;
receiving a target speaker selection, the target speaker selection associated with target speaker characteristics, wherein the target speaker characteristics comprise second speaker conditioning, second noise conditioning, and second style conditioning; and
generating a target speech waveform based at least in part on the target speaker characteristics, wherein the target speech waveform includes at least a portion of the one or more words.
|
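The data flow recited in claim 1 can be illustrated with a minimal sketch. This is an illustration under stated assumptions, not the claimed implementation: the claim does not specify how the conditioning values are computed or how the waveform is synthesized. The names `SpeakerCharacteristics`, `generate_source_characteristics`, `TARGET_SPEAKERS`, `generate_target_waveform`, and `convert` are hypothetical, the waveform statistics and the sine-tone output are toy stand-ins for whatever encoder and synthesizer an embodiment would use, and the spoken words are assumed to be already available as a transcript.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class SpeakerCharacteristics:
    """The three conditioning components named in the claim (hypothetical layout)."""
    speaker_conditioning: List[float]
    noise_conditioning: List[float]
    style_conditioning: List[float]


def generate_source_characteristics(waveform: Sequence[float]) -> SpeakerCharacteristics:
    """Toy stand-in for an encoder: derives one value per component from the waveform."""
    if not waveform:
        raise ValueError("source speech waveform is empty")
    n = len(waveform)
    mean = sum(waveform) / n                    # placeholder for speaker conditioning
    energy = sum(x * x for x in waveform) / n   # placeholder for noise conditioning
    peak = max(abs(x) for x in waveform)        # placeholder for style conditioning
    return SpeakerCharacteristics([mean], [energy], [peak])


# Hypothetical registry mapping a target speaker selection to its characteristics.
TARGET_SPEAKERS: Dict[str, SpeakerCharacteristics] = {
    "speaker_a": SpeakerCharacteristics([0.2], [0.01], [0.8]),
}


def generate_target_waveform(
    words: Sequence[str],
    target: SpeakerCharacteristics,
    samples_per_word: int = 4,
) -> List[float]:
    """Toy stand-in for a synthesizer: one sine cycle per word, scaled by the style value."""
    amplitude = target.style_conditioning[0]
    samples: List[float] = []
    for _ in words:
        for k in range(samples_per_word):
            samples.append(amplitude * math.sin(2 * math.pi * k / samples_per_word))
    return samples


def convert(
    source_waveform: Sequence[float],
    words: Sequence[str],
    target_selection: str,
) -> Tuple[SpeakerCharacteristics, List[float]]:
    """Runs the four claimed steps in order and returns both intermediate and final results."""
    # Steps 1 and 2: receive the source waveform and generate source speaker characteristics.
    source = generate_source_characteristics(source_waveform)
    # Step 3: receive a target speaker selection associated with target characteristics.
    target = TARGET_SPEAKERS[target_selection]
    # Step 4: generate the target waveform based at least in part on the target characteristics.
    return source, generate_target_waveform(words, target)
```

A call such as `convert([0.0, 0.5, -0.5, 1.0], ["hello", "world"], "speaker_a")` returns the source characteristics together with an eight-sample placeholder waveform. Keeping the three conditioning components in one immutable record mirrors the claim's parallel structure for source and target speakers.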