US 11,715,466 B2
Systems and methods for local interpretation of voice queries
Ankur Anil Aher, Kalyan (IN); Kiran Das B, Mysore (IN); Jyothi Ekambaram, Bangalore (IN); and Nishchit Mahajan, Amritsar (IN)
Assigned to Rovi Guides, Inc., San Jose, CA (US)
Filed by Rovi Guides, Inc., San Jose, CA (US)
Filed on Nov. 21, 2019, as Appl. No. 16/690,400.
Prior Publication US 2021/0158807 A1, May 27, 2021
Int. Cl. G10L 15/22 (2006.01); G10L 15/30 (2013.01); G10L 15/26 (2006.01); H04M 1/725 (2021.01)
CPC G10L 15/22 (2013.01) [G10L 15/26 (2013.01); G10L 15/30 (2013.01); G10L 2015/223 (2013.01); H04M 1/725 (2013.01)]
16 Claims
 
1. A method for interpreting a voice query, the method comprising:
receiving, at a local device, a voice query;
determining an audio characteristic of the voice query;
accessing a table, stored at the local device, of a plurality of stored voice queries that have been previously received, wherein the table comprises a number of instances that each of the plurality of stored voice queries has been matched with previously received voice queries;
for each respective stored voice query of the plurality of stored voice queries:
determining whether an audio characteristic of the respective stored voice query is similar to the audio characteristic of the voice query; and
in response to determining that the audio characteristic of the respective stored voice query is similar to the audio characteristic of the voice query, adding the respective voice query to a data structure of candidate voice queries;
comparing the voice query with the candidate voice queries in order of the number of instances for the candidate voice queries;
identifying, based on the comparing, a candidate voice query that matches the voice query;
retrieving, from the table, text corresponding to the candidate voice query;
performing an action corresponding to the text; and
in response to determining that the data structure of candidate voice queries does not contain a candidate voice query that matches the voice query:
transmitting the voice query to a remote server for transcription;
receiving a transcription of the voice query from the remote server;
storing, in the table, the voice query and the transcription; and
further comprising:
determining whether the storage size of the table exceeds a threshold size; and
in response to determining that the storage size of the table exceeds the threshold size, reducing the amount of data stored in the table;
wherein reducing the amount of data stored in the table comprises:
determining a frequency with which each stored voice query is received; and
in response to determining that the frequency of a particular stored voice query is below a threshold frequency, removing the particular stored voice query from the table.
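
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `StoredQuery`, `LocalQueryTable`, `transcribe_remote`, `max_entries`, `min_hits`, and `tolerance` are hypothetical. The sketch also assumes that the audio characteristic is a single number (for example, duration in seconds), that "matching" is byte equality, that table size is measured as an entry count, and that the stored match count stands in for the frequency with which a query is received.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StoredQuery:
    audio: bytes           # previously received voice query (raw audio)
    characteristic: float  # assumed audio characteristic, e.g. duration in seconds
    text: str              # transcription returned by the remote server
    hits: int = 0          # number of times this entry matched a later query


class LocalQueryTable:
    """Hypothetical local table of previously received voice queries."""

    def __init__(self, transcribe_remote: Callable[[bytes], str],
                 max_entries: int = 100, min_hits: int = 1,
                 tolerance: float = 0.2) -> None:
        self.transcribe_remote = transcribe_remote  # stand-in for the remote server
        self.max_entries = max_entries              # assumed size threshold
        self.min_hits = min_hits                    # assumed frequency threshold
        self.tolerance = tolerance                  # assumed similarity window
        self.entries: List[StoredQuery] = []

    def interpret(self, audio: bytes, characteristic: float) -> str:
        # Keep only stored queries whose audio characteristic is similar.
        candidates = [e for e in self.entries
                      if abs(e.characteristic - characteristic) <= self.tolerance]
        # Compare against candidates in descending order of match count.
        candidates.sort(key=lambda e: e.hits, reverse=True)
        for entry in candidates:
            if self._matches(entry.audio, audio):
                entry.hits += 1
                return entry.text  # the caller performs the corresponding action
        # No local match: fall back to remote transcription and store the result.
        text = self.transcribe_remote(audio)
        self.entries.append(StoredQuery(audio, characteristic, text))
        self._prune()
        return text

    @staticmethod
    def _matches(stored: bytes, incoming: bytes) -> bool:
        # Placeholder comparison; a real system would compare audio features.
        return stored == incoming

    def _prune(self) -> None:
        # When the table exceeds the size threshold, drop low-frequency entries.
        if len(self.entries) > self.max_entries:
            self.entries = [e for e in self.entries if e.hits >= self.min_hits]
```

In this sketch, pruning runs immediately after a new entry is stored, so an over-full table drops every entry that has never been matched, including the one just added. The claim does not fix when the size check happens, so that ordering is a design choice of the example only.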