CPC G10L 13/02 (2013.01) [G10L 13/08 (2013.01)] | 16 Claims |
1. A sample generation method, comprising:
acquiring a plurality of text-audio pairs, wherein each text-audio pair comprises a text segment and an audio segment;
calculating, for each text-audio pair among the plurality of text-audio pairs, an audio feature of the audio segment of the text-audio pair, and screening out from the plurality of text-audio pairs, according to the audio feature, a target text-audio pair and a splicing text-audio pair corresponding to the target text-audio pair;
splicing the target text-audio pair and the splicing text-audio pair into a to-be-detected text-audio pair, and detecting the to-be-detected text-audio pair; and
writing the to-be-detected text-audio pair into a training database in a case that the to-be-detected text-audio pair meets a preset detection condition.
|
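The steps of claim 1 can be illustrated with a minimal sketch. The claim does not fix the audio feature, the screening rule, or the preset detection condition, so everything concrete here is an assumption made for illustration: RMS energy as the audio feature, a maximum feature gap (`max_gap`) as the screening rule, a sample-count range as the detection condition, and a plain list standing in for the training database. The names `TextAudioPair`, `audio_feature`, `screen`, `splice`, `meets_condition`, and `generate_sample` are hypothetical and do not come from the claim.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional, Tuple


@dataclass
class TextAudioPair:
    text: str           # text segment
    audio: List[float]  # audio segment as mono samples (assumed representation)


def audio_feature(pair: TextAudioPair) -> float:
    """Assumed audio feature: RMS energy of the audio segment."""
    if not pair.audio:
        return 0.0
    return sqrt(sum(s * s for s in pair.audio) / len(pair.audio))


def screen(
    pairs: List[TextAudioPair], max_gap: float
) -> Optional[Tuple[TextAudioPair, TextAudioPair]]:
    """Assumed screening rule: return the first (target, splicing) pair
    whose features differ by at most max_gap, or None if there is none."""
    feats = [audio_feature(p) for p in pairs]
    for i, target in enumerate(pairs):
        for j, candidate in enumerate(pairs):
            if i != j and abs(feats[i] - feats[j]) <= max_gap:
                return target, candidate
    return None


def splice(target: TextAudioPair, splicing: TextAudioPair) -> TextAudioPair:
    """Concatenate text and audio to form the to-be-detected pair."""
    return TextAudioPair(target.text + splicing.text, target.audio + splicing.audio)


def meets_condition(pair: TextAudioPair, min_samples: int, max_samples: int) -> bool:
    """Assumed detection condition: spliced audio length lies within a range."""
    return min_samples <= len(pair.audio) <= max_samples


def generate_sample(
    pairs: List[TextAudioPair],
    training_db: List[TextAudioPair],
    max_gap: float = 0.1,
    min_samples: int = 4,
    max_samples: int = 16000,
) -> bool:
    """Screen, splice, detect, and write to training_db on success."""
    match = screen(pairs, max_gap)
    if match is None:
        return False
    candidate = splice(*match)
    if meets_condition(candidate, min_samples, max_samples):
        training_db.append(candidate)  # list stands in for the training database
        return True
    return False


pairs = [
    TextAudioPair("hello ", [0.5, -0.5]),
    TextAudioPair("loud", [1.0, -1.0]),
    TextAudioPair("world", [0.5, -0.5, 0.5]),
]
training_db: List[TextAudioPair] = []
written = generate_sample(pairs, training_db)
```

In this toy input the first and third pairs have matching RMS energy, so they are selected as the target and splicing pairs; the second pair is screened out because its energy differs by more than `max_gap`. A real system would likely use richer features and a detection step that checks the spliced audio and text separately, which the claim leaves open.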