US 12,033,614 B2
System and method for secure data augmentation for speech processing systems
Dushyant Sharma, Mountain House, CA (US); Patrick Aubrey Naylor, Reading (GB); and Francesco Nespoli, London (GB)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Nuance Communications, Inc., Burlington, MA (US)
Filed on Jun. 15, 2022, as Appl. No. 17/840,787.
Prior Publication US 2023/0410789 A1, Dec. 21, 2023
Int. Cl. G10L 13/08 (2013.01); G06F 21/62 (2013.01); G06F 40/166 (2020.01); G10L 13/033 (2013.01); G10L 17/02 (2013.01)
CPC G10L 13/08 (2013.01) [G06F 21/6245 (2013.01); G06F 40/166 (2020.01); G10L 13/033 (2013.01); G10L 17/02 (2013.01)] 20 Claims
 
1. A computer-implemented method, executed on a computing device, comprising:
receiving an input speech signal;
receiving a transcription of the input speech signal;
extracting a speaker embedding from the input speech signal;
extracting acoustic properties from the input speech signal;
generating an obscured transcription from the transcription, wherein the obscured transcription includes obscured representations of sensitive content from the transcription;
generating an obscured speech signal based upon, at least in part, the extracted speaker embedding and the obscured transcription, wherein generating the obscured speech signal includes defining a synthetic speaker embedding by modifying the extracted speaker embedding using a speaker embedding modification model, wherein the obscured speech signal includes obscured representations of sensitive content from the input speech signal; and
augmenting the obscured speech signal based upon, at least in part, the extracted acoustic properties.
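
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and every name in it is hypothetical. It assumes that sensitive content is found with two toy regular expressions and replaced with placeholder tags, that the speaker embedding modification model is stood in for by Gaussian perturbation followed by renormalisation, and that the only extracted acoustic property is the RMS level. The speaker-embedding extractor and the speech synthesiser are passed in as callables because the claim does not specify them.

```python
import math
import random
import re

# Hypothetical patterns for "sensitive content"; the claim does not define them.
SENSITIVE_PATTERNS = {
    "DATE": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "NUMBER": re.compile(r"\b\d{3,}\b"),
}


def obscure_transcription(text):
    """Replace each sensitive span with a placeholder tag (an assumption)."""
    for tag, pattern in SENSITIVE_PATTERNS.items():
        text = pattern.sub(f"[{tag}]", text)
    return text


def synthetic_speaker_embedding(embedding, strength=0.5, seed=0):
    """Stand-in for the modification model: add noise, renormalise to unit length."""
    rng = random.Random(seed)
    perturbed = [x + strength * rng.gauss(0.0, 1.0) for x in embedding]
    norm = math.sqrt(sum(x * x for x in perturbed)) or 1.0
    return [x / norm for x in perturbed]


def rms(signal):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def extract_acoustic_properties(signal):
    """Toy acoustic property set: only the RMS level of the input."""
    return {"rms": rms(signal)}


def augment(signal, properties):
    """Scale the obscured signal so its RMS level matches the input's."""
    current = rms(signal)
    gain = properties["rms"] / current if current > 0 else 1.0
    return [s * gain for s in signal]


def secure_augment(signal, transcription, extract_embedding, synthesize):
    """Chain the claimed steps; extract_embedding and synthesize are supplied by the caller."""
    embedding = extract_embedding(signal)
    properties = extract_acoustic_properties(signal)
    obscured_text = obscure_transcription(transcription)
    synthetic = synthetic_speaker_embedding(embedding)
    obscured_signal = synthesize(obscured_text, synthetic)
    return obscured_text, augment(obscured_signal, properties)
```

In this sketch the original speaker embedding never reaches the synthesiser; only the modified embedding and the obscured transcription do, which mirrors the order of the claimed steps.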