US 12,148,415 B2
Systems and methods for synthesizing speech
Peng Zhang, Hangzhou (CN); Xinhui Hu, Hangzhou (CN); Xinkang Xu, Hangzhou (CN); and Jian Lu, Hangzhou (CN)
Assigned to ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD., Hangzhou (CN)
Filed by ZHEJIANG TONGHUASHUN INTELLIGENT TECHNOLOGY CO., LTD., Zhejiang (CN)
Filed on Sep. 11, 2023, as Appl. No. 18/465,143.
Application 18/465,143 is a continuation of application No. 17/445,385, filed on Aug. 18, 2021, granted, now 11,798,527.
Claims priority of application No. 202010835266.3 (CN), filed on Aug. 19, 2020; and application No. 202011148521.3 (CN), filed on Oct. 23, 2020.
Prior Publication US 2023/0419948 A1, Dec. 28, 2023
Int. Cl. G10L 13/00 (2006.01); G10L 13/047 (2013.01); G10L 15/00 (2013.01)
CPC G10L 13/047 (2013.01)
20 Claims
1. A method, that is implemented on a computing device having at least one processor and at least one storage medium including a set of instructions for synthesizing a speech, comprising:
generating the speech based on a text with a speech synthesis model, wherein the speech synthesis model is configured to output the speech corresponding to the text and a stop token indicating where the speech should stop;
obtaining an evaluation index, wherein the evaluation index includes a second effect score of the speech synthesis model, and the second effect score is generated based on at least one of a duration of the speech and a correct ending position of a sentence corresponding to the speech; and
training the speech synthesis model when the evaluation index meets a preset condition.
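
The evaluate-then-train loop recited in claim 1 can be illustrated with a minimal sketch. Nothing below is taken from the patent's description: the names `SynthesisResult`, `second_effect_score`, and `needs_training`, the scoring formula, the frame tolerance, and the choice of "mean score below a threshold" as the preset condition are all hypothetical assumptions made for illustration. The sketch compares where the stop token ended the speech (its duration in frames) with the correct ending position of the sentence, and flags the model for training when the averaged score is too low.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SynthesisResult:
    """Hypothetical record of one synthesized utterance (not from the patent)."""
    num_frames: int          # speech duration: frames emitted before the stop token fired
    expected_end_frame: int  # correct ending position of the sentence, in frames


def second_effect_score(result: SynthesisResult, tolerance_frames: int = 5) -> float:
    """Assumed score in [0, 1]; 1.0 means the speech stopped at the correct ending."""
    error = abs(result.num_frames - result.expected_end_frame)
    if error <= tolerance_frames:
        return 1.0
    # Penalize in proportion to how far the stop point overshoots or undershoots.
    penalty = (error - tolerance_frames) / max(result.expected_end_frame, 1)
    return max(0.0, 1.0 - penalty)


def needs_training(results: List[SynthesisResult],
                   threshold: float = 0.9) -> Tuple[bool, float]:
    """Assumed preset condition: train when the mean score falls below threshold."""
    scores = [second_effect_score(r) for r in results]
    evaluation_index = sum(scores) / len(scores)
    return evaluation_index < threshold, evaluation_index


if __name__ == "__main__":
    batch = [
        SynthesisResult(num_frames=100, expected_end_frame=100),  # stops on time
        SynthesisResult(num_frames=200, expected_end_frame=100),  # runs on too long
    ]
    train, index = needs_training(batch)
    print(f"evaluation index = {index:.3f}, train = {train}")
```

In this sketch an utterance that runs far past the sentence ending pulls the evaluation index down, which is the kind of stop-token failure the claimed second effect score is meant to surface. A real system would substitute its own scoring rule and preset condition.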