US 11,735,162 B2
Text-to-speech (TTS) processing
Jaime Lorenzo Trueba, Cambridge (GB); Thomas Renaud Drugman, Carnieres (BE); Viacheslav Klimkov, Gdansk (PL); Srikanth Ronanki, Cambridge (GB); Thomas Edward Merritt, Cambridge (GB); Andrew Paul Breen, Norwich (GB); and Roberto Barra-Chicote, Cambridge (GB)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Aug. 8, 2022, as Appl. No. 17/882,691.
Application 17/882,691 is a continuation of application No. 16/922,590, filed on Jul. 7, 2020, granted, now 11,410,639.
Application 16/922,590 is a continuation of application No. 16/141,241, filed on Sep. 25, 2018, granted, now 10,741,169, issued on Aug. 11, 2020.
Prior Publication US 2023/0058658 A1, Feb. 23, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G10L 13/10 (2013.01); G10L 25/18 (2013.01); G10L 13/06 (2013.01)
CPC G10L 13/10 (2013.01) [G10L 13/06 (2013.01); G10L 25/18 (2013.01)]
20 Claims
1. A computer-implemented method comprising:
receiving input audio data representing an utterance corresponding to a request to create requested synthesized speech;
processing the input audio data using a first component to determine first acoustic-feature data corresponding to at least one language represented in the utterance;
determining first data representing words corresponding to the requested synthesized speech;
processing the first data to determine second acoustic-feature data;
processing the first acoustic-feature data and the second acoustic-feature data to determine spectrogram data; and
processing the spectrogram data to determine output audio data representing synthesized speech of the words, the synthesized speech corresponding to the at least one language.
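
The data flow recited in claim 1 can be illustrated with a minimal sketch. This is not the patented implementation: the claim names no models, feature shapes, or APIs, so every function name, constant, and numeric formula below is a hypothetical stand-in chosen only to make the sequence of steps concrete. In particular, the stand-in first component does not identify a language, and the words are passed in directly rather than determined from the request.

```python
import math
from typing import List

# Hypothetical sizes; the claim does not specify any of these.
FEATURE_DIM = 4   # assumed length of each acoustic-feature vector
N_BINS = 8        # assumed number of spectrogram frequency bins
FRAME_LEN = 16    # assumed audio samples generated per spectrogram frame


def encode_reference_audio(samples: List[float]) -> List[float]:
    """Stand-in for the 'first component': reduce the input utterance to one
    feature vector (mean absolute level per segment). A real system would
    derive features tied to the language(s) spoken; this toy does not."""
    seg = max(1, len(samples) // FEATURE_DIM)
    feats = []
    for i in range(FEATURE_DIM):
        chunk = samples[i * seg:(i + 1) * seg] or [0.0]
        feats.append(sum(abs(x) for x in chunk) / len(chunk))
    return feats


def encode_words(words: List[str]) -> List[List[float]]:
    """Stand-in for processing the 'first data': one deterministic
    pseudo-feature vector per word, with values in [0, 1)."""
    out = []
    for word in words:
        total = sum(ord(c) for c in word)
        out.append([((total * (k + 1)) % 97) / 97.0 for k in range(FEATURE_DIM)])
    return out


def predict_spectrogram(first: List[float],
                        second: List[List[float]]) -> List[List[float]]:
    """Combine both feature sets: one frame per word, N_BINS magnitudes each.
    Concatenation conditions every frame on the utterance-derived features."""
    frames = []
    for word_vec in second:
        combined = first + word_vec
        frames.append([
            abs(sum(v * math.cos((b + 1) * (j + 1)) for j, v in enumerate(combined)))
            for b in range(N_BINS)
        ])
    return frames


def vocode(spectrogram: List[List[float]]) -> List[float]:
    """Stand-in vocoder: each frame becomes a sum of sinusoids weighted
    by that frame's bin magnitudes."""
    audio = []
    for frame in spectrogram:
        for n in range(FRAME_LEN):
            s = sum(mag * math.sin(2 * math.pi * (b + 1) * n / FRAME_LEN)
                    for b, mag in enumerate(frame))
            audio.append(s / N_BINS)
    return audio


def synthesize(input_audio: List[float], words: List[str]) -> List[float]:
    """Chain the steps in the order the claim recites them."""
    first = encode_reference_audio(input_audio)   # first acoustic-feature data
    second = encode_words(words)                  # second acoustic-feature data
    spec = predict_spectrogram(first, second)     # spectrogram data
    return vocode(spec)                           # output audio data
```

Calling `synthesize(samples, ["hello", "world"])` under these assumptions returns `2 * FRAME_LEN` audio samples, one frame per word. The point of the sketch is the ordering: two independently derived feature sets are combined before spectrogram prediction, and waveform generation is a separate final stage.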