| CPC G10L 15/22 (2013.01) [G10L 15/063 (2013.01); G10L 2015/0631 (2013.01); G10L 2015/223 (2013.01)] | 24 Claims |

|
1. A method comprising:
accessing, using at least one processor of an electronic device, a machine learning model, wherein the machine learning model is a trained student model that is trained using audio samples in a plurality of accent types, and wherein the trained student model is trained to detect a wake word using information distilled from a plurality of trained teacher keyword detector models, each trained teacher keyword detector model associated with a different accent type of the plurality of accent types;
receiving, using the at least one processor, an audio input from an audio input device;
providing, using the at least one processor, the audio input to the trained student model;
calculating, using the trained student model, frame-level probabilities associated with the audio input;
receiving, using the at least one processor, an output from the trained student model including the frame-level probabilities associated with the audio input; and
instructing, using the at least one processor, at least one action based on the frame-level probabilities associated with the audio input.
|
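By way of illustration only, the runtime steps recited in claim 1 (providing audio to the trained student model, obtaining frame-level probabilities, and instructing an action) can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: `toy_student_model` is a hypothetical stand-in that scores each frame by its mean energy rather than a distilled network, and the frame size, the threshold, the consecutive-frame rule, and all function names are assumptions that do not appear in the claim.

```python
import math
from typing import Callable, List, Sequence

FRAME_SIZE = 4  # hypothetical number of samples per frame


def split_frames(audio: Sequence[float], frame_size: int = FRAME_SIZE) -> List[Sequence[float]]:
    """Split audio into non-overlapping frames, dropping any trailing partial frame."""
    return [audio[i:i + frame_size]
            for i in range(0, len(audio) - frame_size + 1, frame_size)]


def toy_student_model(frames: Sequence[Sequence[float]]) -> List[float]:
    """Hypothetical stand-in for the trained student model.

    Returns one wake-word probability per frame, computed here as a
    logistic function of the frame's mean energy (an assumption).
    """
    probs = []
    for frame in frames:
        energy = sum(x * x for x in frame) / len(frame)
        probs.append(1.0 / (1.0 + math.exp(-(10.0 * energy - 5.0))))
    return probs


def decide_action(frame_probs: Sequence[float],
                  threshold: float = 0.8,
                  min_frames: int = 2) -> str:
    """Return "wake" if min_frames consecutive frames reach the threshold, else "ignore"."""
    run = 0
    for p in frame_probs:
        run = run + 1 if p >= threshold else 0
        if run >= min_frames:
            return "wake"
    return "ignore"


def handle_audio(audio: Sequence[float],
                 model: Callable[[Sequence[Sequence[float]]], List[float]] = toy_student_model) -> str:
    """Receive audio, provide it to the model, and instruct an action from its output."""
    frames = split_frames(audio)
    frame_probs = model(frames)        # frame-level probabilities
    return decide_action(frame_probs)  # action based on those probabilities
```

In this sketch, the decision rule requires several consecutive high-probability frames so that a single spurious frame does not trigger the action; the claim itself requires only that the action be based on the frame-level probabilities.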