US 12,217,759 B2
Voice to text conversion based on third-party agent content
Barnaby James, Los Gatos, CA (US); Bo Wang, San Jose, CA (US); Sunil Vemuri, Pleasanton, CA (US); David Schairer, San Jose, CA (US); Ulas Kirazci, Mountain View, CA (US); Ertan Dogrultan, Belmont, CA (US); and Petar Aleksic, Jersey City, NJ (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by GOOGLE LLC, Mountain View, CA (US)
Filed on Feb. 6, 2024, as Appl. No. 18/434,602.
Application 18/434,602 is a continuation of application No. 18/125,606, filed on Mar. 23, 2023, granted, now 11,922,945.
Application 18/125,606 is a continuation of application No. 17/582,926, filed on Jan. 24, 2022, granted, now 11,626,115, issued on Apr. 11, 2023.
Application 17/582,926 is a continuation of application No. 16/791,334, filed on Feb. 14, 2020, granted, now 11,232,797, issued on Jan. 25, 2022.
Application 16/791,334 is a continuation of application No. 15/372,188, filed on Dec. 7, 2016, granted, now 10,600,418, issued on Mar. 24, 2020.
Prior Publication US 2024/0274133 A1, Aug. 15, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 15/26 (2006.01); G06F 40/205 (2020.01); G06F 40/284 (2020.01); G06F 40/30 (2020.01); G10L 15/18 (2013.01); G10L 15/183 (2013.01); G10L 15/22 (2006.01); G10L 15/30 (2013.01)

CPC G10L 15/26 (2013.01) [G06F 40/205 (2020.01); G06F 40/284 (2020.01); G06F 40/30 (2020.01); G10L 15/1815 (2013.01); G10L 15/183 (2013.01); G10L 15/22 (2013.01); G10L 15/30 (2013.01); G10L 2015/223 (2013.01); G10L 2015/228 (2013.01)]

20 Claims

1. A voice-enabled electronic device comprising:

at least one processor; and

memory storing instructions that, when executed by the at least one processor, cause the at least one processor to be operable to:

receive, from a third-party agent, one or more contextual parameters associated with the third-party agent, wherein the third-party agent is managed by an additional party that is distinct from the party that manages a local agent of the voice-enabled electronic device;

receive, from a user of the voice-enabled electronic device, a voice input provided by the user;

in response to receiving the voice input:

convert, using a voice to text model, the voice input to text, wherein the instructions to convert the voice input to text comprise instructions to use one or more of the contextual parameters to bias the voice to text model in converting at least one segment of the voice input to the text; and

transmit at least a portion of the text to the third-party agent.
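
The flow recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify how the voice to text model is biased, so the rescoring below is only an assumed stand-in: candidate transcriptions with base scores substitute for the model output, and the contextual parameters are treated as a set of expected tokens. All names (`ThirdPartyAgent`, `convert_with_bias`, `handle_voice_input`, the `boost` value) and the sample data are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class ThirdPartyAgent:
    """Hypothetical stand-in for an agent managed by a separate party."""
    expected_tokens: frozenset
    received: list = field(default_factory=list)

    def contextual_parameters(self):
        # Assumption: the contextual parameters are a set of tokens the agent expects.
        return self.expected_tokens

    def receive_text(self, text):
        self.received.append(text)


def convert_with_bias(hypotheses, context_tokens, boost=0.3):
    """Return the highest-scoring hypothesis after boosting context matches.

    hypotheses: list of (text, base_score) pairs standing in for model output.
    """
    def biased_score(item):
        text, score = item
        hits = sum(1 for word in text.lower().split() if word in context_tokens)
        return score + boost * hits

    return max(hypotheses, key=biased_score)[0]


def handle_voice_input(agent, hypotheses):
    # 1. Receive contextual parameters from the third-party agent.
    params = agent.contextual_parameters()
    # 2. Convert the voice input to text, biased by those parameters.
    text = convert_with_bias(hypotheses, params)
    # 3. Transmit at least a portion of the text back to the agent.
    agent.receive_text(text)
    return text


# Invented sample data: an agent expecting a cuisine type.
agent = ThirdPartyAgent(frozenset({"laotian", "thai", "italian"}))
candidates = [("lotion", 0.55), ("laotian", 0.45)]
result = handle_voice_input(agent, candidates)
print(result)
```

In this toy setup the unbiased scores would favor "lotion", while the boost applied for the agent's expected tokens selects "laotian" instead. A real system would more likely apply the bias inside the decoder or language model rather than to a finished candidate list.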