| CPC G10L 13/047 (2013.01) | 20 Claims |

|
1. A method, that is implemented on a computing device having at least one processor and at least one storage medium including a set of instructions for synthesizing a speech, comprising:
generating the speech based on a text with a speech synthesis model, wherein the speech synthesis model is configured to output the speech corresponding to the text and a stop token indicating where the speech should stop;
obtaining an evaluation index, wherein the evaluation index includes a second effect score of the speech synthesis model, and the second effect score is generated based on at least one of a duration of the speech and a correct ending position of a sentence corresponding to the speech; and
training the speech synthesis model when the evaluation index meets a preset condition.
|
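The three steps of claim 1 (generate, evaluate, conditionally train) can be illustrated with a minimal sketch. Everything below is hypothetical and not taken from the claim: the `SynthesisResult` container, the scoring formula, the equal weighting, the frame tolerance, and the reading of the "preset condition" as "the averaged score falls below a threshold". The claim itself only requires that the second effect score be based on at least one of the speech duration and the correct sentence ending position.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SynthesisResult:
    """Hypothetical record for one synthesized utterance."""
    duration_s: float           # duration of the generated speech, in seconds
    stop_frame: int             # frame where the model emitted its stop token
    expected_duration_s: float  # reference duration for the text (assumed known)
    expected_stop_frame: int    # frame of the correct sentence ending (assumed known)


def second_effect_score(r: SynthesisResult, frame_tolerance: int = 5) -> float:
    """Return a score in [0, 1]; higher is better. Illustrative formula only."""
    # Duration term: 1.0 when durations match, falling linearly to 0.0.
    duration_error = abs(r.duration_s - r.expected_duration_s) / r.expected_duration_s
    duration_term = max(0.0, 1.0 - duration_error)
    # Ending term: 1.0 if the stop token lands within the tolerance of the
    # correct ending position, otherwise 0.0.
    on_time = abs(r.stop_frame - r.expected_stop_frame) <= frame_tolerance
    ending_term = 1.0 if on_time else 0.0
    # Equal weighting is an arbitrary choice for this sketch.
    return 0.5 * duration_term + 0.5 * ending_term


def should_train(results: Sequence[SynthesisResult], threshold: float = 0.8) -> bool:
    """Assumed preset condition: the mean score drops below `threshold`."""
    if not results:
        raise ValueError("at least one synthesis result is required")
    evaluation_index = sum(second_effect_score(r) for r in results) / len(results)
    return evaluation_index < threshold


if __name__ == "__main__":
    good = SynthesisResult(2.0, 100, 2.0, 100)  # stops where expected
    bad = SynthesisResult(4.0, 200, 2.0, 100)   # runs on past the sentence end
    print(should_train([good]), should_train([good, bad]))
```

In this sketch an utterance that runs well past the correct sentence ending lowers the averaged score, which is what would trigger the training step; a real system could use a different score definition or condition.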