CPC G10L 15/20 (2013.01) [G06F 3/165 (2013.01); G06F 3/167 (2013.01); G10L 15/222 (2013.01); G10L 17/06 (2013.01); G10L 21/034 (2013.01); G10L 25/84 (2013.01); H03G 3/3005 (2013.01); G10L 15/26 (2013.01); G10L 17/00 (2013.01)] | 20 Claims |
1. A computer-implemented method executed on data processing hardware that causes the data processing hardware to perform operations comprising:
receiving a first query spoken by a user and captured by a microphone of a computing device associated with the user;
providing, for audible playback from the computing device, a text-to-speech (TTS) output generated by a TTS system associated with the computing device, the TTS output comprising synthesized audio that conveys a response to the first query;
while the computing device is audibly playing back the TTS output:
detecting a barge-in event from the user to provide a second query;
in response to detecting the barge-in event, initiating a reduction in an audio output level of the computing device; and
receiving an audio signal captured by the microphone that conveys the second query spoken by the user; and
providing the audio signal characterizing the second query to a speech recognition engine.
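For illustration only, the sequence of operations recited in claim 1 can be sketched as below. This is a minimal sketch under stated assumptions, not the claimed implementation: the `Device` class, the `handle_turn` function, the `duck` factor of 0.2, and the callables `respond`, `is_barge_in`, and `recognize` are all hypothetical names introduced here, and audio frames are modeled as plain strings. The claim does not say how a barge-in event is detected or whether the triggering frame is part of the second query; the sketch assumes a caller-supplied detector and includes the triggering frame.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class Device:
    """Hypothetical stand-in for the computing device (not from the claim)."""
    volume: float = 1.0          # audio output level, 0.0 to 1.0
    playing_tts: bool = False    # True while TTS output is being played back
    log: List[str] = field(default_factory=list)

    def play_tts(self, text: str) -> None:
        # Start audible playback of the synthesized response.
        self.playing_tts = True
        self.log.append(f"tts:{text}")

    def duck(self, factor: float = 0.2) -> None:
        # Initiate a reduction in the output level (factor is an assumption).
        self.volume *= factor
        self.log.append("duck")


def handle_turn(
    device: Device,
    first_query: str,
    respond: Callable[[str], str],
    frames: Iterable[str],
    is_barge_in: Callable[[str], bool],
    recognize: Callable[[List[str]], str],
) -> Optional[str]:
    """Play a TTS response; on barge-in, duck the output and forward the audio."""
    # Provide the TTS output that conveys the response to the first query.
    device.play_tts(respond(first_query))

    captured: List[str] = []
    barged = False
    for frame in frames:  # microphone frames arriving during playback
        if not barged and device.playing_tts and is_barge_in(frame):
            barged = True
            device.duck()  # reduce the level in response to the barge-in
        if barged:
            captured.append(frame)  # audio that conveys the second query

    if not barged:
        return None  # playback finished without interruption
    # Provide the captured audio to the speech recognition engine.
    return recognize(captured)
```

In this sketch the output level is lowered once, at the moment the detector first fires, and everything captured from that frame onward is handed to the recognizer as a single batch; a streaming recognizer or a gradual volume ramp would be equally consistent with the claim language.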