US 12,136,412 B2
Training keyword spotters
Matthew Sharifi, Kilchberg (CH); Kevin Kilgour, Mountain View, CA (US); Dominik Roblek, Meilen (CH); and James Lin, Mountain View, CA (US)
Assigned to Google LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on May 4, 2022, as Appl. No. 17/662,021.
Application 17/662,021 is a continuation of application No. 16/717,518, filed on Dec. 17, 2019, granted, now Pat. No. 11,341,954.
Prior Publication US 2022/0262345 A1, Aug. 18, 2022
Int. Cl. G10L 15/22 (2006.01); G06N 3/04 (2023.01); G06N 3/08 (2023.01); G10L 13/00 (2006.01); G10L 15/06 (2013.01); G10L 15/16 (2006.01); G10L 15/08 (2006.01)
CPC G10L 15/063 (2013.01) [G06N 3/04 (2013.01); G06N 3/08 (2013.01); G10L 13/00 (2013.01); G10L 15/16 (2013.01); G10L 15/22 (2013.01); G10L 2015/088 (2013.01); G10L 2015/223 (2013.01)] 18 Claims
 
1. A computer-implemented method when executed on data processing hardware of a user device causes the data processing hardware to perform operations comprising:
capturing a first set of training audio samples spoken by a user of the user device, each training audio sample containing a custom hotword, the custom hotword comprising one or more words;
obtaining a pre-trained model, the pre-trained model trained by a remote system in communication with the user device;
training, using the pre-trained model, a custom hotword model on the first set of training audio samples to learn how to detect a presence of the custom hotword in audio data;
receiving streaming audio data captured by a user device;
determining, using the trained custom hotword model, whether the custom hotword is present in the streaming audio data; and
when the custom hotword is present in the streaming audio data, initiating a wake-up process on the user device for processing the custom hotword and/or one or more other terms following the custom hotword in the streaming audio data.
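
By way of non-limiting illustration only, the operations recited in claim 1 may be sketched as follows. The sketch is not the claimed implementation. It assumes the pre-trained model can be treated as a fixed embedding function, and it substitutes a simple centroid plus cosine-similarity threshold for the training step, a technique the claim does not specify. All names (toy_pretrained_embed, train_custom_hotword_model, hotword_present, process_stream, the 0.95 threshold) are hypothetical.

```python
import math
from typing import Callable, Iterable, List, Sequence

Embedding = List[float]
EmbedFn = Callable[[Sequence[float]], Embedding]


def toy_pretrained_embed(samples: Sequence[float]) -> Embedding:
    """Hypothetical stand-in for the pre-trained model obtained from a remote system.

    Assumption: mean absolute amplitude over four equal chunks. A real
    pre-trained model would typically be a neural network.
    """
    chunks = 4
    size = max(1, len(samples) // chunks)
    return [
        sum(abs(x) for x in samples[i * size:(i + 1) * size]) / size
        for i in range(chunks)
    ]


def cosine(a: Embedding, b: Embedding) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_custom_hotword_model(
    embed: EmbedFn, training_samples: Sequence[Sequence[float]]
) -> Embedding:
    """Assumed 'training': average the embeddings of the user's spoken samples."""
    if not training_samples:
        raise ValueError("at least one training audio sample is required")
    vectors = [embed(s) for s in training_samples]
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]


def hotword_present(
    embed: EmbedFn, model: Embedding, window: Sequence[float], threshold: float = 0.95
) -> bool:
    """Decide whether one window of audio matches the custom hotword model."""
    return cosine(embed(window), model) >= threshold


def process_stream(
    embed: EmbedFn,
    model: Embedding,
    windows: Iterable[Sequence[float]],
    on_wake: Callable[[], None],
) -> bool:
    """Scan streaming audio window by window; call on_wake on the first match."""
    for window in windows:
        if hotword_present(embed, model, window):
            on_wake()  # stands in for initiating the wake-up process
            return True
    return False
```

In this sketch the user device would call train_custom_hotword_model once on the captured samples, then pass each incoming audio window to process_stream. Keeping the embedding function fixed and fitting only a small on-device component is one plausible reading of "training, using the pre-trained model"; other arrangements, such as fine-tuning additional layers, are equally consistent with the claim language.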