US 11,900,948 B1
Automatic speaker identification using speech recognition features
Hugh Evan Secker-Walker, Newburyport, MA (US); Baiyang Liu, Bellevue, WA (US); and Frederick Victor Weber, New York, NY (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Jan. 7, 2022, as Appl. No. 17/571,127.
Application 17/571,127 is a continuation of application No. 15/929,795, filed on May 21, 2020, granted, now 11,222,639.
Application 15/929,795 is a continuation of application No. 16/448,788, filed on Jun. 21, 2019, granted, now 10,665,245, issued on May 26, 2020.
Application 16/448,788 is a continuation of application No. 15/420,018, filed on Jan. 30, 2017, granted, now 10,332,525, issued on Jun. 25, 2019.
Application 15/420,018 is a continuation of application No. 13/957,257, filed on Aug. 1, 2013, granted, now 9,558,749, issued on Jan. 31, 2017.
Int. Cl. G10L 15/22 (2006.01); G10L 15/00 (2013.01); G10L 17/00 (2013.01); G10L 17/06 (2013.01); G10L 17/12 (2013.01); G10L 17/02 (2013.01); G10L 17/16 (2013.01); G10L 15/18 (2013.01); G10L 17/22 (2013.01); G10L 15/20 (2006.01); G10L 15/26 (2006.01); G10L 15/02 (2006.01); G10L 15/08 (2006.01)
CPC G10L 17/06 (2013.01) [G10L 15/18 (2013.01); G10L 17/02 (2013.01); G10L 17/12 (2013.01); G10L 17/16 (2013.01); G10L 17/22 (2013.01); G10L 15/20 (2013.01); G10L 15/26 (2013.01); G10L 2015/025 (2013.01); G10L 2015/088 (2013.01)]
20 Claims
 
1. A computer-implemented method comprising:
under control of one or more computing devices configured to execute specific instructions,
receiving audio data representing an utterance;
generating speaker identifier data based at least partly on the audio data;
generating natural language understanding (“NLU”) data based at least partly on analyzing the audio data using an NLU subsystem, wherein the NLU data represents a command;
identifying, based at least partly on the NLU data, an application of a plurality of applications to generate at least a portion of a response to the utterance; and
sending the speaker identifier data to the application, wherein the application generates a response, customized to a user profile of the application, based on the speaker identifier data.
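The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: every name in it (NluData, generate_speaker_id, generate_nlu_data, MusicApp, WeatherApp, APPLICATIONS, handle_utterance) is hypothetical, the profile data is invented, and the speaker-identification and NLU steps are toy stand-ins that read a marker byte and a pre-recognized transcript instead of processing real audio.

```python
from dataclasses import dataclass, field
from typing import Dict, Protocol


@dataclass(frozen=True)
class NluData:
    """Hypothetical container for NLU output; `intent` stands in for the command."""
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


def generate_speaker_id(audio: bytes) -> str:
    # Toy stand-in (assumption): the first byte marks the speaker.
    # A real system would derive this from the audio itself.
    return "speaker-a" if audio[:1] == b"\x00" else "speaker-b"


def generate_nlu_data(audio: bytes) -> NluData:
    # Toy stand-in (assumption): bytes after the marker are treated as a transcript.
    text = audio[1:].decode("utf-8").lower()
    if text.startswith("play"):
        return NluData("play_music", {"query": text[len("play"):].strip()})
    return NluData("get_weather")


class Application(Protocol):
    def respond(self, speaker_id: str, nlu: NluData) -> str: ...


class MusicApp:
    # The application's own user profiles, keyed by speaker identifier (invented data).
    profiles = {"speaker-a": "jazz", "speaker-b": "rock"}

    def respond(self, speaker_id: str, nlu: NluData) -> str:
        genre = self.profiles.get(speaker_id, "pop")
        return f"Playing {genre} for {speaker_id}"


class WeatherApp:
    profiles = {"speaker-a": "Seattle", "speaker-b": "New York"}

    def respond(self, speaker_id: str, nlu: NluData) -> str:
        city = self.profiles.get(speaker_id, "your area")
        return f"Forecast for {city}"


# Maps the command in the NLU data to one application of a plurality of applications.
APPLICATIONS: Dict[str, Application] = {
    "play_music": MusicApp(),
    "get_weather": WeatherApp(),
}


def handle_utterance(audio: bytes) -> str:
    speaker_id = generate_speaker_id(audio)   # speaker identifier data
    nlu = generate_nlu_data(audio)            # NLU data representing a command
    app = APPLICATIONS[nlu.intent]            # identify the application from the NLU data
    return app.respond(speaker_id, nlu)       # send the speaker identifier to the application
```

In this sketch the router never looks at the profiles; it only forwards the speaker identifier, and each application resolves that identifier against its own profile data. That mirrors the claim's wording that the response is customized to a user profile of the application.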