US 12,204,866 B1
Voice based searching and dialog management system
Srinivasa Sandeep Atluri, Seattle, WA (US); Constantin Daniel Marcu, Rolling Hills, CA (US); Kevin Small, Seattle, WA (US); Kemal Oral Cansizlar, Seattle, WA (US); Vijit Singh, Seattle, WA (US); Li Zhou, Seattle, WA (US); Aritra Biswas, Seattle, WA (US); and Bhanu Pratap Jain, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Sep. 10, 2021, as Appl. No. 17/472,203.
Int. Cl. G06F 40/35 (2020.01); G06F 16/632 (2019.01); G10L 13/08 (2013.01); G10L 15/18 (2013.01)
CPC G06F 40/35 (2020.01) [G06F 16/632 (2019.01); G10L 13/08 (2013.01); G10L 15/18 (2013.01)]
20 Claims
 
1. A computer-implemented method comprising:
receiving, from a device, first audio data corresponding to a first spoken input;
determining that the first spoken input requests information corresponding to a first entity;
sending, to an item retrieval component, a first request to search for items corresponding to the first entity;
in response to the first request, receiving, from the item retrieval component, a plurality of item results;
determining a first attribute corresponding to a first subset of item results from the plurality of item results;
determining, based on the first attribute, a second spoken input to be provided by a user to request to view the first subset of item results;
based on determining the second spoken input, determining first data responsive to the second spoken input, wherein the first data is determined based on the first subset of item results;
in response to the first spoken input, sending, to the device, a second subset of item results from the plurality of item results;
after determining the first data, sending, to the device, a representation of the second spoken input;
after sending the representation of the second spoken input and after determining the first data, receiving, from the device, second audio data corresponding to the second spoken input; and
in response to the second audio data corresponding to the second spoken input, sending, to the device, the first data.
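The ordering of steps in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation: plain text stands in for the audio data and speech recognition, `search_items` is a hypothetical stand-in for the item retrieval component, "brand" is an arbitrarily chosen attribute, and the names `Item`, `DialogSession`, `handle_first_input`, and `handle_followup` are invented for this example. The point it shows is that the response to the suggested follow-up (the "first data") is determined before the suggestion is sent to the device, so it can be returned directly when the user speaks the follow-up.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Item:
    name: str
    attributes: dict


@dataclass
class DialogSession:
    # Maps a suggested follow-up utterance to its precomputed response.
    precomputed: dict = field(default_factory=dict)


def search_items(entity):
    """Hypothetical stand-in for the item retrieval component."""
    catalog = [
        Item("Trail Runner", {"category": "shoes", "brand": "Acme"}),
        Item("Road Racer", {"category": "shoes", "brand": "Acme"}),
        Item("City Walker", {"category": "shoes", "brand": "Bolt"}),
    ]
    return [item for item in catalog if item.attributes["category"] == entity]


def handle_first_input(session, entity, top_n=2):
    """Handle the first input; text stands in for recognized audio (assumption)."""
    results = search_items(entity)  # the plurality of item results
    if not results:
        return [], None

    # Assumption: the most common brand serves as the "first attribute".
    brand_counts = Counter(item.attributes["brand"] for item in results)
    brand, _ = brand_counts.most_common(1)[0]
    first_subset = [item for item in results if item.attributes["brand"] == brand]

    # Determine the follow-up the user could say, and precompute its response
    # (the "first data") before the suggestion is returned to the device.
    suggestion = f"show me {brand.lower()} {entity}"
    session.precomputed[suggestion] = [item.name for item in first_subset]

    # The second subset is what the device receives for the first input.
    second_subset = [item.name for item in results[:top_n]]
    return second_subset, suggestion


def handle_followup(session, utterance):
    """Return the precomputed response if the utterance matches a suggestion."""
    return session.precomputed.get(utterance.strip().lower())
```

A caller would invoke `handle_first_input(session, "shoes")`, present the returned items and suggestion, and later pass the user's follow-up text to `handle_followup`. Exact string matching on the lowercased utterance is a simplification chosen for brevity; the claim does not specify how the second audio data is matched to the second spoken input.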