US 11,887,600 B2
Techniques for interpreting spoken input using non-verbal cues
Erika Doggett, Los Angeles, CA (US); Nathan Nocon, Valencia, CA (US); Ashutosh Modi, Dwarka (IN); Joseph Charles Sengir, San Tan Valley, AZ (US); and Maxwell McCoy, Los Angeles, CA (US)
Assigned to DISNEY ENTERPRISES, INC., Burbank, CA (US)
Filed by DISNEY ENTERPRISES, INC., Burbank, CA (US)
Filed on Oct. 4, 2019, as Appl. No. 16/593,938.
Prior Publication US 2021/0104241 A1, Apr. 8, 2021
Int. Cl. G10L 15/25 (2013.01); G10L 15/22 (2006.01); G10L 15/06 (2013.01); G10L 15/26 (2006.01)
CPC G10L 15/25 (2013.01) [G10L 15/063 (2013.01); G10L 15/22 (2013.01); G10L 15/26 (2013.01); G10L 2015/228 (2013.01)]
21 Claims
 
1. A computer-implemented method for interpreting spoken user input, the method comprising:
determining a first time marker associated with a first prediction that is generated based on a first type of non-verbal cue and a second time marker associated with a second prediction that is generated based on a second type of non-verbal cue;
determining that the first time marker and the second time marker fall within a relevance time window associated with a base time marker for a first text input that has been derived from a first spoken input received from a user;
upon determining that the first time marker and the second time marker fall within the relevance time window, generating a first predicted context based on a function of the first text input, the first prediction, the second prediction, a first weight indicating a relative contribution of the first text input to the first predicted context, a second weight indicating a relative contribution of the first prediction to the first predicted context, and a third weight indicating a relative contribution of the second prediction to the first predicted context; and
transmitting the first text input and the first predicted context to at least one software application that subsequently performs one or more additional actions based on the first text input and the first predicted context.
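
The steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify the form of the combining function, the shape of the predictions, or the size of the relevance time window, so everything concrete below is an assumption made for illustration: predictions are modeled as hypothetical label-to-score dictionaries, the function is a simple weighted sum, and the window is a fixed number of seconds before and after the base time marker. All names (Prediction, within_window, predict_context) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Prediction:
    """A prediction derived from one type of non-verbal cue (hypothetical shape)."""
    label_scores: Dict[str, float]  # assumed: context label -> confidence
    time_marker: float              # seconds on a shared clock


def within_window(t: float, base: float, before: float, after: float) -> bool:
    """True if time marker t lies inside [base - before, base + after]."""
    return base - before <= t <= base + after


def predict_context(
    text_scores: Dict[str, float],
    base_time: float,
    first: Prediction,
    second: Prediction,
    weights: Tuple[float, float, float] = (0.5, 0.3, 0.2),
    window: Tuple[float, float] = (2.0, 1.0),
) -> Optional[str]:
    """Combine text and two non-verbal predictions into one predicted context.

    Returns None when either time marker falls outside the relevance window.
    The weighted sum is an assumed stand-in for the claimed "function".
    """
    before, after = window
    if not (within_window(first.time_marker, base_time, before, after)
            and within_window(second.time_marker, base_time, before, after)):
        return None

    w_text, w_first, w_second = weights
    combined: Dict[str, float] = {}
    sources = (
        (w_text, text_scores),
        (w_first, first.label_scores),
        (w_second, second.label_scores),
    )
    # Each source contributes to every label in proportion to its weight.
    for weight, scores in sources:
        for label, score in scores.items():
            combined[label] = combined.get(label, 0.0) + weight * score

    # The label with the highest weighted score is the predicted context.
    return max(combined, key=combined.get)


# Usage: the text alone leans "happy", but both cues suggest "sarcastic".
text = {"happy": 0.6, "sarcastic": 0.4}
facial = Prediction({"happy": 0.1, "sarcastic": 0.9}, time_marker=9.5)
gesture = Prediction({"happy": 0.2, "sarcastic": 0.8}, time_marker=10.2)
context = predict_context(text, 10.0, facial, gesture)
print(context)  # sarcastic
```

In this sketch the returned context would then accompany the text input when it is passed to a downstream application. Returning None when a cue falls outside the window is one possible design choice; the claim itself only recites the case in which both time markers fall within the window.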