US 12,093,609 B2
Voice-controlled entry of content into graphical user interfaces
Srikanth Pandiri, Zurich (CH); Luv Kothari, Sunnyvale, CA (US); Behshad Behzadi, Freienbach (CH); Zaheed Sabur, Baar (CH); Domenico Carbotta, Zurich (CH); Akshay Kannan, Fremont, CA (US); Qi Wang, Palo Alto, CA (US); Gokay Baris Gultekin, Palo Alto, CA (US); Angana Ghosh, Mountain View, CA (US); Xu Liu, San Jose, CA (US); Yang Lu, Los Altos, CA (US); and Steve Cheng, Los Altos, CA (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by GOOGLE LLC, Mountain View, CA (US)
Filed on Nov. 9, 2023, as Appl. No. 18/388,465.
Application 18/388,465 is a continuation of application No. 17/619,414, granted, now Pat. No. 11,853,649, previously published as PCT/US2019/066211, filed on Dec. 13, 2019.
Claims priority of provisional application 62/915,607, filed on Oct. 15, 2019.
Prior Publication US 2024/0078083 A1, Mar. 7, 2024
Int. Cl. G06F 3/048 (2013.01); G06F 3/0481 (2022.01); G06F 3/0484 (2022.01); G06F 3/04886 (2022.01); G06F 3/16 (2006.01); G06F 40/117 (2020.01); G06F 40/143 (2020.01); G06F 40/174 (2020.01); G06F 40/30 (2020.01); G10L 15/22 (2006.01); G10L 15/26 (2006.01)
CPC G06F 3/167 (2013.01) [G06F 3/0481 (2013.01); G06F 3/0484 (2013.01); G06F 3/04886 (2013.01); G06F 40/117 (2020.01); G06F 40/143 (2020.01); G06F 40/174 (2020.01); G06F 40/30 (2020.01); G10L 15/22 (2013.01); G10L 15/26 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method implemented by one or more processors, the method comprising:
receiving a selection of a keyboard element that is provided to a graphical user interface of a keyboard application that is being rendered at a computing device,
wherein the keyboard application is separate from an automated assistant that is accessible at the computing device, and
wherein the keyboard application is provided by a third-party entity that is separate from a first-party entity that provided the computing device and/or the automated assistant;
receiving, subsequent to receiving the selection of the keyboard element, a spoken utterance from a user,
wherein the user is accessing a particular application that includes an entry field when the user provides the spoken utterance;
causing, based on the spoken utterance, a candidate text string to be determined,
wherein the candidate text string characterizes at least a portion of the spoken utterance provided by the user, and
wherein the candidate text string is determined by the automated assistant that is provided by the first-party entity;
determining whether to incorporate the candidate text string into the entry field or to incorporate non-textual visual content, that is determined based on the candidate text string, into the entry field; and
in response to determining to incorporate the non-textual visual content into the entry field:
causing the keyboard application, that is provided by the third-party entity, to incorporate the non-textual visual content into the entry field of the particular application.
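The claimed flow can be illustrated with a minimal sketch. This is purely an editor's illustration, not the patented implementation: the function names, the text-to-visual mapping, and the decision heuristic (exact match against a lookup table) are all hypothetical stand-ins for the claim's "determining" steps, and real systems would use speech recognition and richer semantic matching.

```python
# Hypothetical sketch of the claimed flow: an automated assistant
# produces a candidate text string from a spoken utterance, the system
# decides whether the text itself or non-textual visual content derived
# from it should be incorporated, and the (third-party) keyboard
# application inserts the chosen content into the entry field.

# Hypothetical mapping from candidate text strings to non-textual
# visual content (emoji here), standing in for the determination of
# visual content "based on the candidate text string".
TEXT_TO_VISUAL = {
    "thumbs up": "\U0001F44D",
    "smiley face": "\U0001F600",
}

def transcribe(spoken_utterance: str) -> str:
    """Stand-in for the automated assistant's speech-to-text step;
    here the 'utterance' is already text, so it is only normalized."""
    return spoken_utterance.strip().lower()

def determine_entry(spoken_utterance: str) -> tuple[str, str]:
    """Return (kind, content) to incorporate into the entry field:
    kind is 'visual' when the candidate text maps to non-textual
    visual content, otherwise 'text'."""
    candidate = transcribe(spoken_utterance)
    if candidate in TEXT_TO_VISUAL:
        return ("visual", TEXT_TO_VISUAL[candidate])
    return ("text", candidate)

def incorporate(entry_field: list[str], spoken_utterance: str) -> None:
    """Stand-in for the keyboard application inserting the chosen
    content into the particular application's entry field."""
    _kind, content = determine_entry(spoken_utterance)
    entry_field.append(content)

field: list[str] = []
incorporate(field, "Thumbs up")       # maps to non-textual visual content
incorporate(field, "see you soon")    # falls through as plain text
print(field)  # → ['👍', 'see you soon']
```

The single decision point in `determine_entry` mirrors the claim's branch between incorporating the candidate text string and incorporating non-textual visual content determined from it.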