US 12,394,419 B2
Compensating for hardware disparities when determining whether to offload assistant-related processing tasks from certain client devices
Vikram Aggarwal, Palo Alto, CA (US); and Suresh Batchu, Sunnyvale, CA (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Appl. No. 17/927,259
Filed by GOOGLE LLC, Mountain View, CA (US)
PCT Filed May 27, 2020, PCT No. PCT/US2020/034756
§ 371(c)(1), (2) Date Nov. 22, 2022,
PCT Pub. No. WO2021/242236, PCT Pub. Date Dec. 2, 2021.
Prior Publication US 2023/0215438 A1, Jul. 6, 2023
Int. Cl. G10L 15/30 (2013.01); G06F 21/31 (2013.01); G10L 15/00 (2013.01); G10L 15/18 (2013.01); G10L 15/22 (2006.01)
CPC G10L 15/30 (2013.01) [G06F 21/31 (2013.01); G10L 15/005 (2013.01); G10L 15/1815 (2013.01); G10L 15/22 (2013.01); G10L 2015/223 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method implemented by one or more processors, the method comprising:
receiving, at an audio interface of a client computing device, an ongoing spoken utterance from a user,
wherein the ongoing spoken utterance is directed to an automated assistant that is accessible via the client computing device;
generating, in response to receiving the ongoing spoken utterance, first audio data characterizing a first portion of the ongoing spoken utterance as the user continues to provide the ongoing spoken utterance to the automated assistant;
providing the first audio data to a server computing device via a network connection between the client computing device and the server computing device,
wherein the server computing device performs speech-to-text processing on the first audio data to generate first textual data;
receiving, by the client computing device, status data from the server computing device in response to the server computing device receiving the first audio data from the client computing device;
determining, based on the status data, whether to provide second audio data to the server computing device for further speech-to-text processing,
wherein the second audio data characterizes a second portion of the ongoing spoken utterance that is received at the audio interface of the client computing device subsequent to the client computing device receiving the first portion of the ongoing spoken utterance; and
when the client computing device determines to not provide the second audio data to the server computing device for further speech-to-text processing:
generating, at the client computing device, second textual data that characterizes other natural language content of the second portion of the ongoing spoken utterance.
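The decision flow recited in claim 1 can be illustrated with a minimal sketch. Everything below is an assumption for illustration only: the names `StubServer`, `Status`, the `overloaded` field, and `local_speech_to_text` are hypothetical stand-ins for the claimed "server computing device," "status data," and on-device speech-to-text, and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass
class Status:
    # Stand-in for the claimed "status data"; `overloaded` is an assumed
    # field the client uses to decide whether to keep offloading.
    overloaded: bool


class StubServer:
    """Hypothetical stand-in for the server computing device."""

    def __init__(self, overloaded: bool):
        self._overloaded = overloaded

    def speech_to_text(self, audio: bytes) -> tuple[str, Status]:
        # Per the claim, the server performs speech-to-text on the audio
        # data and the client receives status data in response.
        return ("<server transcript>", Status(self._overloaded))


def local_speech_to_text(audio: bytes) -> str:
    # Assumed on-device fallback used when the client determines not to
    # provide the second audio data to the server.
    return "<local transcript>"


def handle_utterance(first_audio: bytes, second_audio: bytes,
                     server: StubServer) -> list[str]:
    # 1. Provide the first audio data to the server for STT processing
    #    and receive first textual data plus status data.
    first_text, status = server.speech_to_text(first_audio)
    transcripts = [first_text]
    # 2. Determine, based on the status data, whether to provide the
    #    second audio data to the server or transcribe it locally.
    if status.overloaded:
        transcripts.append(local_speech_to_text(second_audio))
    else:
        second_text, _ = server.speech_to_text(second_audio)
        transcripts.append(second_text)
    return transcripts
```

Under these assumptions, a server reporting an overloaded status causes the client to generate the second textual data on-device, while a healthy status keeps both portions offloaded.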