US 11,676,581 B2
Method and apparatus for evaluating trigger phrase enrollment
Joel A. Clark, Woodridge, IL (US); Tenkasi V. Ramabadran, Oswego, IL (US); and Mark A. Jasiuk, Chicago, IL (US)
Assigned to Google Technology Holdings LLC, Mountain View, CA (US)
Filed by Google Technology Holdings LLC, Mountain View, CA (US)
Filed on Aug. 17, 2020, as Appl. No. 16/995,673.
Application 16/995,673 is a continuation of application No. 16/216,908, filed on Dec. 11, 2018, granted, now 10,777,190.
Application 16/216,908 is a continuation of application No. 15/612,693, filed on Jun. 2, 2017, granted, now 10,192,548, issued on Jan. 29, 2019.
Application 15/612,693 is a continuation of application No. 15/609,342, filed on May 31, 2017, granted, now 10,163,439, issued on Dec. 25, 2018.
Application 15/609,342 is a continuation of application No. 15/605,565, filed on May 25, 2017, granted, now 10,163,438, issued on Dec. 25, 2018.
Application 15/605,565 is a continuation of application No. 15/384,142, filed on Dec. 19, 2016, granted, now 10,170,105, issued on Jan. 1, 2019.
Application 15/384,142 is a continuation of application No. 14/050,596, filed on Oct. 10, 2013, granted, now 9,548,047, issued on Jan. 17, 2017.
Claims priority of provisional application 61/860,730, filed on Jul. 31, 2013.
Prior Publication US 2020/0380961 A1, Dec. 3, 2020
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 15/00 (2013.01); G10L 15/18 (2013.01); G10L 15/06 (2013.01); G10L 21/0264 (2013.01); G10L 25/84 (2013.01); G10L 15/08 (2006.01); G10L 15/20 (2006.01)
CPC G10L 15/1807 (2013.01) [G10L 15/063 (2013.01); G10L 21/0264 (2013.01); G10L 25/84 (2013.01); G10L 15/20 (2013.01); G10L 2015/088 (2013.01)]
14 Claims
 
1. A method of training a trigger phrase model, the method comprising:
during a trigger phrase enrollment process:
receiving, at a speech recognition-enabled electronic device associated with a user, audio corresponding to the user speaking a trigger phrase; and
based on a count of a number of frames in the audio that have a measure of noise variability of background noise exceeding a noise variability threshold satisfying a threshold value, training, by the speech recognition-enabled electronic device, the trigger phrase model to both:
adapt to a voice of the user of the speech recognition-enabled device using the audio corresponding to the user speaking the trigger phrase; and
detect the trigger phrase in utterances spoken by the user using the audio corresponding to the user speaking the trigger phrase,
wherein the speech recognition-enabled electronic device, while in a sleep mode, is configured to use the trigger phrase model trained during the trigger phrase enrollment process to:
reject the trigger phrase when spoken in utterances by people other than the user of the speech recognition-enabled electronic device; and
wake from the sleep mode when the trigger phrase is spoken in utterances by the user of the speech recognition-enabled electronic device, the sleep mode comprising a power-saving mode of operation in which one or more parts of the speech recognition-enabled electronic device are in a low-power state or powered off.
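
The frame-counting condition recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation. The claim does not say how the per-frame measure of noise variability is computed, so the sketch takes those values as given input. It also assumes that the count "satisfying a threshold value" means the count is at or below a maximum, which is one possible reading. The names count_noisy_frames, should_train, variability_threshold, and max_noisy_frames are hypothetical.

```python
from typing import Sequence


def count_noisy_frames(
    frame_variability: Sequence[float],
    variability_threshold: float,
) -> int:
    """Count frames whose background-noise variability exceeds the threshold."""
    return sum(1 for v in frame_variability if v > variability_threshold)


def should_train(
    frame_variability: Sequence[float],
    variability_threshold: float,
    max_noisy_frames: int,
) -> bool:
    """Decide whether the enrollment audio may be used for training.

    ASSUMPTION: the count "satisfies" the threshold value when it is at or
    below max_noisy_frames. The claim text does not fix the direction of
    this comparison.
    """
    noisy = count_noisy_frames(frame_variability, variability_threshold)
    return noisy <= max_noisy_frames


if __name__ == "__main__":
    # Hypothetical per-frame variability values for one enrollment utterance.
    frames = [0.1, 0.5, 0.9, 0.2]
    print(count_noisy_frames(frames, variability_threshold=0.4))
    print(should_train(frames, variability_threshold=0.4, max_noisy_frames=2))
```

Under this reading, an utterance with too many frames of highly variable background noise would not be used to train the trigger phrase model.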