US 11,676,606 B2
Context-sensitive dynamic update of voice to text model in a voice-enabled electronic device
Yuli Gao, Sunnyvale, CA (US); Sangsoo Sung, Palo Alto, CA (US); and Prathab Murugesan, Mountain View, CA (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Aug. 9, 2021, as Appl. No. 17/397,592.
Application 17/397,592 is a continuation of application No. 16/665,309, filed on Oct. 28, 2019, granted, now 11,087,762.
Application 16/665,309 is a continuation of application No. 15/969,291, filed on May 2, 2018, granted, now 10,482,883, issued on Nov. 19, 2019.
Application 15/969,291 is a continuation of application No. 14/723,250, filed on May 27, 2015, granted, now 9,966,073, issued on May 8, 2018.
Prior Publication US 2021/0366484 A1, Nov. 25, 2021
Int. Cl. G10L 15/26 (2006.01); G10L 15/22 (2006.01); G10L 15/065 (2013.01); G10L 15/18 (2013.01); G10L 15/08 (2006.01)
CPC G10L 15/26 (2013.01) [G10L 15/22 (2013.01); G10L 15/065 (2013.01); G10L 15/083 (2013.01); G10L 15/1822 (2013.01); G10L 2015/223 (2013.01); G10L 2015/228 (2013.01)] 20 Claims
OG exemplary drawing
 
1. A method implemented by one or more processors of a client device, the method comprising:
generating, using a locally stored list of relevant context sensitive entities, voice to text model update data for a context sensitive parameter that is associated with the relevant context sensitive entities, wherein the voice to text model update data comprises decoding paths for the relevant context sensitive entities;
subsequent to generating the voice to text model update data:
receiving a voice input with a voice-enabled electronic device, the voice input including an original request that includes first and second portions, the second portion including a first context sensitive entity among the relevant context sensitive entities; and
in the voice-enabled electronic device, and responsive to receiving the first portion of the voice input:
performing local processing of the first portion of the voice input;
determining during the local processing that the first portion is associated with the context sensitive parameter;
in response to determining that the first portion is associated with the context sensitive parameter:
dynamically updating a local voice to text model, used by the voice-enabled electronic device, using the locally generated voice to text model update data, wherein dynamically updating the local voice to text model facilitates recognition of the first context sensitive entity in performing local processing of the second portion of the voice input;
generating, utilizing the dynamically updated local voice to text model, a recognition of the second portion of the voice input; and
causing performance of a voice action that is based on the recognition of the second portion of the voice input.
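The following is a minimal, illustrative sketch of the flow recited in claim 1, not the patented implementation: a device holds a local list of context-sensitive entities, precomputes model update data from that list, and, upon locally detecting a context-sensitive parameter in the first portion of a voice input, dynamically updates its local voice-to-text model before recognizing the second portion. All class names, the "contact" parameter, the "call <name>" trigger, and the toy dictionary-based recognizer are hypothetical assumptions introduced only for illustration.

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class VoiceToTextModel:
    # Toy stand-in for a local voice-to-text model: a base vocabulary plus
    # dynamically injected decoding paths for context-sensitive entities.
    base_vocabulary: set = field(default_factory=lambda: {"call", "play", "text"})
    dynamic_paths: dict = field(default_factory=dict)  # parameter -> {phrase: entity}

    def update(self, parameter: str, decoding_paths: dict) -> None:
        # Dynamic update: merge the precomputed decoding paths for this parameter.
        self.dynamic_paths.setdefault(parameter, {}).update(decoding_paths)

    def recognize(self, audio_tokens: list, parameter: Optional[str] = None) -> str:
        # Toy "recognition": prefer an injected entity decoding path when one matches.
        phrase = " ".join(audio_tokens)
        if parameter and phrase in self.dynamic_paths.get(parameter, {}):
            return self.dynamic_paths[parameter][phrase]
        return phrase if set(audio_tokens) <= self.base_vocabulary else "<unk>"


def build_update_data(entities: list) -> dict:
    # Generate model update data (decoding paths) from the locally stored
    # list of relevant context-sensitive entities, ahead of any voice input.
    return {e.lower(): e for e in entities}


def handle_voice_input(model, update_data, first_portion, second_portion):
    # Local processing of the first portion: detect whether it is associated
    # with a (hypothetical) "contact" context-sensitive parameter.
    parameter = "contact" if first_portion and first_portion[0] == "call" else None
    if parameter:
        # Dynamically update the local model before the second portion is processed.
        model.update(parameter, update_data)
    entity = model.recognize(second_portion, parameter)
    return f"ACTION: {first_portion[0]} -> {entity}"


if __name__ == "__main__":
    contacts = ["Seamus", "Ana Lucia"]            # hypothetical local entity list
    update_data = build_update_data(contacts)     # generated before the voice input
    model = VoiceToTextModel()
    # Voice input "call seamus": the first portion triggers the dynamic update,
    # the second portion is recognized with the updated model.
    print(handle_voice_input(model, update_data, ["call"], ["seamus"]))

As a usage note, in this sketch the update data is generated once from the local entity list and only merged into the model when the first portion of the input indicates the context-sensitive parameter, mirroring the ordering recited in the claim.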