US 12,254,889 B2
Method, apparatus and system for hybrid speech synthesis
Ahmed Mustafa, Erlangen Bayer (DE); and Arijit Biswas, Schwaig bei Nuremberg (DE)
Assigned to DOLBY INTERNATIONAL AB, Amsterdam Zuidoost (NL)
Appl. No. 17/419,047
Filed by DOLBY INTERNATIONAL AB, Amsterdam Zuidoost (NL)
PCT Filed Dec. 20, 2019, PCT No. PCT/EP2019/086656
§ 371(c)(1), (2) Date Jun. 28, 2021,
PCT Pub. No. WO2020/141108, PCT Pub. Date Jul. 9, 2020.
Claims priority of provisional application 62/787,831, filed on Jan. 3, 2019.
Claims priority of application No. 19150154 (EP), filed on Jan. 3, 2019.
Prior Publication US 2022/0059107 A1, Feb. 24, 2022
Int. Cl. G10L 25/30 (2013.01); G10L 13/047 (2013.01); G10L 19/032 (2013.01); G10L 19/08 (2013.01); G06N 3/08 (2023.01)
CPC G10L 19/08 (2013.01) [G10L 13/047 (2013.01); G10L 19/032 (2013.01); G06N 3/08 (2013.01)] 13 Claims
 
1. A method of decoding an original speech signal for hybrid adversarial-parametric speech synthesis, wherein the method includes the steps of:
(a) receiving quantized original linear prediction coding parameters estimated by applying linear prediction coding analysis filtering to an original speech signal and a quantized compressed representation of a residual of the original speech signal;
(b) dequantizing the original linear prediction coding parameters and the compressed representation of the residual;
(c) inputting the dequantized compressed representation of the residual into a decoder part of a Generator for applying adversarial mapping from the compressed residual domain to a fake (first) signal domain;
(d) outputting, by the decoder part of the Generator, a fake speech signal;
(e) applying linear prediction coding analysis filtering to the fake speech signal for obtaining a corresponding fake residual; and
(f) reconstructing the original speech signal by applying linear prediction coding cross-synthesis filtering to the fake residual and the dequantized original linear prediction coding analysis parameters.
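
Steps (a) to (f) can be illustrated with the minimal sketch below. It is not the patented implementation and rests on several assumptions not found in the claim. Quantization is modeled as a uniform scalar quantizer applied directly to the predictor coefficients (the hypothetical `dequantize` helper). A practical codec would more likely quantize a representation such as line spectral frequencies. The decoder part of the Generator is replaced by `toy_generator_decoder`, a fixed random projection that merely stands in for a trained network. All function names, step sizes and frame lengths are hypothetical.

```python
import numpy as np

def dequantize(indices, step):
    # Assumption: uniform scalar quantizer; the claim does not fix a scheme.
    return np.asarray(indices, dtype=float) * step

def lpc_coefficients(signal, order):
    # Autocorrelation method with the Levinson-Durbin recursion.
    # Returns a = [1, a1, ..., ap] for A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    n = len(signal)
    r = np.array([np.dot(signal[:n - k], signal[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0] + 1e-9  # small guard against an all-zero frame
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def analysis_filter(signal, a):
    # FIR filtering with A(z): maps a signal to its prediction residual.
    return np.convolve(signal, a)[:len(signal)]

def synthesis_filter(residual, a):
    # All-pole filtering with 1/A(z), zero initial state.
    out = np.zeros(len(residual))
    p = len(a) - 1
    for n in range(len(residual)):
        acc = residual[n]
        for k in range(1, min(p, n) + 1):
            acc -= a[k] * out[n - k]
        out[n] = acc
    return out

def toy_generator_decoder(latent, n_samples):
    # Placeholder for the trained Generator decoder: a fixed random projection.
    rng = np.random.default_rng(0)
    weights = rng.standard_normal((n_samples, len(latent)))
    return np.tanh(weights @ latent)

def decode(q_lpc, q_latent, lpc_step, latent_step, n_samples):
    # (a), (b): receive and dequantize the LPC parameters and the compressed residual.
    a_orig = np.concatenate(([1.0], dequantize(q_lpc, lpc_step)))
    latent = dequantize(q_latent, latent_step)
    # (c), (d): the decoder part of the Generator outputs a fake speech signal.
    fake_speech = toy_generator_decoder(latent, n_samples)
    # (e): LPC analysis filtering of the fake speech yields the fake residual.
    a_fake = lpc_coefficients(fake_speech, len(a_orig) - 1)
    fake_residual = analysis_filter(fake_speech, a_fake)
    # (f): cross-synthesis: fake residual shaped by the original spectral envelope.
    return synthesis_filter(fake_residual, a_orig)

if __name__ == "__main__":
    frame = decode(np.array([-900]), np.array([3, -1, 4, 2]), 1e-3, 0.25, 64)
    print(frame.shape)
```

The point of the cross-synthesis in step (f) is that the excitation comes from the generated signal while the spectral envelope comes from the transmitted parameters of the original signal. In this sketch the synthesis filter is stable only if the dequantized coefficients describe a minimum-phase A(z), which the example input (a single pole at 0.9) satisfies.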