US 11,769,482 B2
Method and apparatus of synthesizing speech, method and apparatus of training speech synthesis model, electronic device, and storage medium
Wenfu Wang, Beijing (CN); Tao Sun, Beijing (CN); Xilei Wang, Beijing (CN); Junteng Zhang, Beijing (CN); Zhengkun Gao, Beijing (CN); and Lei Jia, Beijing (CN)
Assigned to Beijing Baidu Netcom Science Technology Co., Ltd., Beijing (CN)
Filed by Beijing Baidu Netcom Science Technology Co., Ltd., Beijing (CN)
Filed on Sep. 29, 2021, as Appl. No. 17/489,616.
Claims priority of application No. 202011253104.5 (CN), filed on Nov. 11, 2020.
Prior Publication US 2022/0020356 A1, Jan. 20, 2022
Int. Cl. G10L 13/10 (2013.01); G10L 25/30 (2013.01)
CPC G10L 13/10 (2013.01) [G10L 25/30 (2013.01)]
10 Claims
 
1. A method of synthesizing a speech, comprising:
acquiring a style information of a speech to be synthesized, a tone information of the speech to be synthesized, and a content information of a text to be processed;
generating an acoustic feature information of the text to be processed, by using a pre-trained speech synthesis model, based on the style information, the tone information, and the content information of the text to be processed; and
synthesizing the speech for the text to be processed, based on the acoustic feature information of the text to be processed,
wherein the acquiring the style information of the speech to be synthesized comprises:
acquiring a description information of an input style of a user; and
determining a style identifier, from a preset style table, corresponding to the input style according to the description information of the input style, as the style information of the speech to be synthesized.
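
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The table contents, the keyword-matching rule, the integer tone identifier, and all function names are hypothetical and chosen only for illustration. The pre-trained speech synthesis model and the final synthesis step are replaced by trivial stand-ins that return dummy numbers, so only the data flow (style lookup, then acoustic features, then speech) is shown.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical preset style table: description keyword -> style identifier.
# The claim does not specify the table contents or the matching rule.
PRESET_STYLE_TABLE: Dict[str, int] = {
    "humorous": 1,
    "joyful": 2,
    "sad": 3,
    "traditional": 4,
}


def determine_style_id(description: str,
                       table: Dict[str, int] = PRESET_STYLE_TABLE) -> int:
    """Map the user's style description to a style identifier in the table.

    Assumption: a case-insensitive keyword match stands in for whatever
    matching the real system performs.
    """
    text = description.lower()
    for keyword, style_id in table.items():
        if keyword in text:
            return style_id
    raise KeyError(f"no preset style matches description: {description!r}")


@dataclass
class SynthesisInputs:
    style_id: int        # style information (identifier from the preset table)
    tone_id: int         # tone information (hypothetical integer identifier)
    content: List[str]   # content information (here: characters of the text)


def acquire_inputs(style_description: str, tone_id: int, text: str) -> SynthesisInputs:
    """Acquire the style, tone, and content information."""
    return SynthesisInputs(determine_style_id(style_description), tone_id, list(text))


def generate_acoustic_features(inputs: SynthesisInputs) -> List[List[float]]:
    """Stand-in for the pre-trained speech synthesis model.

    Emits one dummy feature frame per content unit. A real model would
    output acoustic features such as spectrogram frames.
    """
    return [[float(inputs.style_id), float(inputs.tone_id), float(ord(ch))]
            for ch in inputs.content]


def synthesize_speech(features: List[List[float]]) -> List[float]:
    """Stand-in for the synthesis step: one dummy sample per feature frame."""
    return [sum(frame) for frame in features]


if __name__ == "__main__":
    inputs = acquire_inputs("Please read this in a humorous way", tone_id=7, text="hi")
    features = generate_acoustic_features(inputs)
    print(inputs.style_id, len(features), len(synthesize_speech(features)))
```

In this sketch the style identifier is resolved once, before the model is invoked, which mirrors the ordering in the claim: the description information is turned into a style identifier, and that identifier is then used as the style information fed to the model.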