US 12,249,319 B2
Automatically determining language for speech recognition of spoken utterance received via an automated assistant interface
Pu-Sen Chao, Los Altos, CA (US); Diego Melendo Casado, Mountain View, CA (US); and Ignacio Lopez Moreno, New York, NY (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by GOOGLE LLC, Mountain View, CA (US)
Filed on Nov. 13, 2023, as Appl. No. 18/389,033.
Application 18/389,033 is a continuation of application No. 17/120,906, filed on Dec. 14, 2020, granted, now 11,817,085.
Application 17/120,906 is a continuation of application No. 15/769,023, granted, now 10,896,672, issued on Jan. 19, 2021, previously published as PCT/US2018/027812, filed on Apr. 16, 2018.
Prior Publication US 2024/0194191 A1, Jun. 13, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 15/14 (2006.01); G06F 3/16 (2006.01); G10L 15/00 (2013.01); G10L 15/02 (2006.01); G10L 15/08 (2006.01); G10L 15/18 (2013.01); G10L 15/183 (2013.01); G10L 15/22 (2006.01); G10L 15/30 (2013.01)
CPC G10L 15/14 (2013.01) [G06F 3/167 (2013.01); G10L 15/005 (2013.01); G10L 15/02 (2013.01); G10L 15/1822 (2013.01); G10L 15/183 (2013.01); G10L 15/22 (2013.01); G10L 15/30 (2013.01); G10L 2015/088 (2013.01); G10L 2015/223 (2013.01); G10L 2015/228 (2013.01)] 16 Claims
OG exemplary drawing
 
1. A method implemented by one or more processors, the method comprising:
receiving audio data corresponding to a spoken utterance of a user, the audio data being based on detection of the spoken utterance by a client device;
processing the audio data using a first speech recognition model corresponding to a first language;
determining, based on processing the audio data using the first speech recognition model, content that is responsive to the spoken utterance;
monitoring for an additional spoken input from the user;
receiving, during the monitoring, additional audio data corresponding to an additional spoken utterance, the additional audio data being based on detection of the additional spoken utterance by the client device;
determining, based on receiving the additional audio data, that the additional spoken utterance is provided by an additional user;
accessing, based on the additional spoken utterance being provided by the additional user, a user profile corresponding to the additional user;
determining, based on accessing the user profile corresponding to the additional user, that the user profile provides a correspondence between the additional user and a second language;
based on determining that the user profile provides the correspondence between the additional user and the second language:
using a second speech recognition model, for the second language, in processing the additional audio data; and
causing the client device to render further responsive content based on the processing of the additional audio data using the second speech recognition model.
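The claimed method above — recognize a first utterance with a first-language model, attribute a follow-up utterance to a different user, look up that user's profile language, and switch to the corresponding second model — can be sketched in code. This is a minimal illustration only, not the patented implementation; every name here (`UserProfile`, `make_model`, `identify_speaker`, `handle_utterances`, the model and profile tables) is hypothetical.

```python
# Illustrative sketch of claim 1 (all names hypothetical): process a first
# utterance with a speech recognition model for a first language, then,
# when an additional utterance is attributed to an additional user, use
# the model for the second language given by that user's profile.

from dataclasses import dataclass


@dataclass
class UserProfile:
    user_id: str
    language: str  # the correspondence between the user and a language


def make_model(language):
    # Stand-in for a per-language speech recognition model.
    def recognize(samples):
        return f"[{language} transcript of {samples!r}]"
    return recognize


MODELS = {"en": make_model("en"), "es": make_model("es")}
PROFILES = {"user_b": UserProfile("user_b", "es")}


def identify_speaker(audio_data):
    # Hypothetical speaker identification: for this sketch, assume the
    # audio data is tagged with the speaker's identity.
    return audio_data.get("speaker")


def handle_utterances(first_audio, additional_audio, first_language="en"):
    # Process the first spoken utterance using the first speech
    # recognition model and determine responsive content.
    first_model = MODELS[first_language]
    responsive = first_model(first_audio["samples"])

    # During monitoring, an additional utterance arrives; determine
    # whether it was provided by an additional (different) user.
    speaker = identify_speaker(additional_audio)
    if speaker != first_audio.get("speaker") and speaker in PROFILES:
        # Access the additional user's profile and use the second
        # speech recognition model for its designated language.
        second_language = PROFILES[speaker].language
        second_model = MODELS[second_language]
    else:
        second_model = first_model

    # Render further responsive content from the second model's output.
    further_responsive = second_model(additional_audio["samples"])
    return responsive, further_responsive
```

For example, an English first utterance followed by a follow-up attributed to a user whose profile designates Spanish would be transcribed by the `en` and `es` models respectively; if the same user speaks again, the first model is retained.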