US 11,783,826 B2
System and method for data augmentation and speech processing in dynamic acoustic environments
Patrick A. Naylor, Reading (GB); Dushyant Sharma, Woburn, MA (US); Uwe Helmut Jost, Groton, MA (US); and William F. Ganong, III, Brookline, MA (US)
Assigned to Nuance Communications, Inc., Burlington, MA (US)
Filed by Nuance Communications, Inc., Burlington, MA (US)
Filed on Feb. 18, 2021, as Appl. No. 17/178,785.
Prior Publication US 2022/0262357 A1, Aug. 18, 2022
Int. Cl. G10L 15/22 (2006.01); G10L 15/06 (2013.01); G10L 21/0332 (2013.01); G10L 21/0224 (2013.01); G10L 21/0216 (2013.01)
CPC G10L 15/22 (2013.01) [G10L 15/063 (2013.01); G10L 21/0224 (2013.01); G10L 21/0332 (2013.01); G10L 2021/02166 (2013.01)]
14 Claims
 
1. A computer-implemented method, executed on a computing device, comprising:
receiving one or more inputs indicative of at least one of:
a relative location of a speaker and a microphone array, and
a relative orientation of the speaker and the microphone array;
receiving one or more reference signals; and
training a speech processing system using the one or more inputs and the one or more reference signals, wherein training the speech processing system using the one or more inputs and the one or more reference signals includes training a plurality of speech processing models for a plurality of acoustic variations associated with the one or more reference signals;
receiving one or more run-time inputs indicative of at least one of:
the relative location of the speaker and the microphone array, and
the relative orientation of the speaker and the microphone array;
receiving a speech signal via the microphone array; and
performing speech processing via the trained speech processing system using the one or more run-time inputs and the speech signal.
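
The data flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `SpatialInput`, `variation_key`, `GainModel`, `train_models`, and `process_speech` are hypothetical, as are the near/far and facing/averted buckets and their thresholds. Each "speech processing model" is reduced to a single normalisation gain, a toy stand-in chosen only so that the example runs. The sketch assumes that each acoustic variation can be labelled from the location and orientation inputs. It trains one model per variation from the reference signals, then selects a model at run time from the run-time inputs.

```python
import math
from dataclasses import dataclass
from statistics import fmean
from typing import Dict, Iterable, List, Optional, Sequence, Tuple

VariationKey = Tuple[str, str]


@dataclass(frozen=True)
class SpatialInput:
    """Hypothetical container for the claimed inputs; either field may be absent."""
    distance_m: Optional[float] = None   # relative location of speaker and array
    azimuth_deg: Optional[float] = None  # relative orientation of speaker and array


def variation_key(spatial: SpatialInput) -> VariationKey:
    """Map a spatial input to a coarse acoustic-variation label (assumed bucketing)."""
    if spatial.distance_m is None:
        dist = "unknown"
    else:
        dist = "near" if spatial.distance_m < 1.5 else "far"
    if spatial.azimuth_deg is None:
        facing = "unknown"
    else:
        facing = "facing" if abs(spatial.azimuth_deg) <= 45.0 else "averted"
    return dist, facing


@dataclass
class GainModel:
    """Toy stand-in for a speech processing model: one scalar gain."""
    gain: float

    def process(self, signal: Sequence[float]) -> List[float]:
        return [self.gain * s for s in signal]


def _rms(signal: Sequence[float]) -> float:
    return math.sqrt(fmean(s * s for s in signal))


def train_models(
    examples: Iterable[Tuple[SpatialInput, Sequence[float]]]
) -> Dict[VariationKey, GainModel]:
    """Train one model per acoustic variation from (input, reference signal) pairs."""
    grouped: Dict[VariationKey, List[Sequence[float]]] = {}
    for spatial, reference in examples:
        grouped.setdefault(variation_key(spatial), []).append(reference)

    models: Dict[VariationKey, GainModel] = {}
    for key, references in grouped.items():
        level = fmean(_rms(ref) for ref in references)
        # The "training" here only learns the gain that normalises the mean RMS to 1.
        models[key] = GainModel(gain=1.0 / level if level > 0 else 1.0)
    return models


def process_speech(
    models: Dict[VariationKey, GainModel],
    runtime_input: SpatialInput,
    speech: Sequence[float],
) -> List[float]:
    """Select the model matching the run-time inputs and apply it to the speech signal."""
    key = variation_key(runtime_input)
    if key not in models:
        raise KeyError(f"no model trained for acoustic variation {key}")
    return models[key].process(speech)


if __name__ == "__main__":
    trained = train_models([
        (SpatialInput(distance_m=0.5, azimuth_deg=0.0), [0.5, -0.5, 0.5, -0.5]),
        (SpatialInput(distance_m=3.0, azimuth_deg=90.0), [0.25, -0.25]),
    ])
    print(process_speech(trained, SpatialInput(1.0, 10.0), [0.1, 0.2]))
```

In a real system the per-variation models would be speech recognition or enhancement models rather than scalar gains. The sketch also raises an error when no trained model matches the run-time inputs; a fallback model would be an equally reasonable choice.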