US 12,444,411 B1
Multiple results presentation
Denys Derezhenets, Etobicoke (CA); Igor Adirin, Coquitlam (CA); and Gaurav Mehrotra, Markham (CA)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Sep. 6, 2023, as Appl. No. 18/461,634.
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 15/18 (2013.01); G10L 13/02 (2013.01); G10L 15/22 (2006.01)
CPC G10L 15/1822 (2013.01) [G10L 13/02 (2013.01); G10L 15/22 (2013.01); G10L 2015/223 (2013.01)]
20 Claims
 
1. A computer-implemented method, comprising:
receiving first input audio data corresponding to a first utterance detected by a device;
performing speech processing using the first input audio data to determine at least a first natural language understanding (NLU) hypothesis and a second NLU hypothesis for the first utterance;
determining that the first NLU hypothesis corresponds to a first intent;
determining that the second NLU hypothesis corresponds to a second intent;
determining that the first NLU hypothesis more likely represents what the first utterance meant than the second NLU hypothesis;
using a first component to obtain, from a first skill component associated with the first intent, first visual content and results data responsive to the first utterance;
causing the device to present the first visual content;
performing speech synthesis using the results data to generate output audio data responsive to the first utterance;
causing the device to present output audio corresponding to the output audio data;
in response to the first input audio data, using a second component to obtain, from a second skill component, second visual content;
causing the device to present the second visual content while presenting the first visual content;
receiving, from the device, input data corresponding to a second input;
determining that the second input corresponds to the second visual content;
obtaining output content corresponding to the second skill component and responsive to the second input; and
causing the device to present the output content.
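
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (NluHypothesis, Device, SKILLS, synthesize, handle_utterance, handle_second_input), the numeric confidence score, and the sample intents and content strings are hypothetical. It also assumes that the second skill component is the one registered for the second intent, which the claim does not require, and it stands in a string for synthesized audio.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class NluHypothesis:
    intent: str
    score: float  # hypothetical confidence; the claim only requires a relative ranking


@dataclass
class Device:
    """Hypothetical stand-in for the device: records what it is asked to present."""
    screen: Dict[str, str] = field(default_factory=dict)  # skill name -> visual content
    audio: List[str] = field(default_factory=list)        # presented output audio
    output: List[str] = field(default_factory=list)       # content presented after a second input


# Hypothetical registry: intent -> (skill name, handler returning (visual content, results data)).
SKILLS: Dict[str, Tuple[str, Callable[[], Tuple[str, str]]]] = {
    "PlayMusicIntent": ("music", lambda: ("Album art: Greatest Hits", "Playing Greatest Hits")),
    "BuyMusicIntent": ("shopping", lambda: ("Buy Greatest Hits on CD", "")),
}


def synthesize(text: str) -> str:
    """Placeholder for speech synthesis; a real system would return audio data."""
    return f"<audio:{text}>"


def handle_utterance(hypotheses: List[NluHypothesis], device: Device) -> None:
    # Rank the hypotheses; the top one is treated as the more likely meaning.
    ranked = sorted(hypotheses, key=lambda h: h.score, reverse=True)
    first, second = ranked[0], ranked[1]

    # First skill: visual content plus results data that is turned into audio.
    first_skill, first_handler = SKILLS[first.intent]
    first_visual, results = first_handler()
    device.screen[first_skill] = first_visual
    device.audio.append(synthesize(results))

    # Second skill (assumed here to match the second intent): visual content only,
    # shown alongside the first visual content.
    second_skill, second_handler = SKILLS[second.intent]
    second_visual, _ = second_handler()
    device.screen[second_skill] = second_visual


def handle_second_input(selected_visual: str, device: Device) -> None:
    # Map the second input to the visual content it refers to, then present
    # output content attributed to the skill that supplied that content.
    for skill, visual in device.screen.items():
        if visual == selected_visual:
            device.output.append(f"{skill} skill response to selection")
            return
    raise ValueError("input does not correspond to any presented visual content")
```

In this sketch, calling handle_utterance with a higher-scored PlayMusicIntent and a lower-scored BuyMusicIntent leaves both visuals on device.screen and one audio entry from the music skill; a later handle_second_input("Buy Greatest Hits on CD", device) routes the follow-up to the shopping skill.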