CPC H04N 21/8106 (2013.01) [G10L 13/02 (2013.01); H04N 21/4884 (2013.01)] | 18 Claims |
1. A multimedia data generating method, comprising:
receiving text information inputted by a user;
displaying, in response to a recording trigger operation for the text information, the text information and acquiring a first reading speech of the text information;
generating a first multimedia data based on the text information and the first reading speech and displaying the first multimedia data; and
marking, in a case of detecting that a match rate between a first target speech segment and a first target text segment is lower than a match rate threshold while acquiring the first reading speech, the first target speech segment and the first target text segment,
wherein the first multimedia data comprise the first reading speech and a video image matched with the text information, the first multimedia data comprise a plurality of first multimedia segments, the plurality of first multimedia segments corresponding to a plurality of text segments included in the text information, respectively;
wherein a first target multimedia segment comprises a first target video segment and the first target speech segment, the first target multimedia segment referring to a first multimedia segment in the plurality of first multimedia segments corresponding to the first target text segment in the plurality of text segments, the first target video segment including a video image matched with the first target text segment, the first target speech segment including a reading speech of the first target text segment.
|
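The marking step of claim 1 can be illustrated with a minimal sketch. The claim does not state how the match rate is computed or what the threshold value is. The sketch therefore assumes that each speech segment has already been transcribed by some speech recognizer, and it uses a word-level similarity ratio from Python's standard `difflib` module as a stand-in match rate. The names `SegmentPair`, `match_rate`, `mark_mismatches`, and the threshold of 0.8 are hypothetical and do not come from the claim.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher

# Assumed threshold; the claim only requires that some match rate threshold exists.
MATCH_RATE_THRESHOLD = 0.8


@dataclass
class SegmentPair:
    """A text segment paired with the transcript of its speech segment (hypothetical type)."""
    text: str
    transcript: str  # assumed to come from a speech recognizer
    marked: bool = False


def match_rate(text: str, transcript: str) -> float:
    """Stand-in match rate: word-level similarity in the range [0, 1]."""
    expected = text.lower().split()
    spoken = transcript.lower().split()
    return SequenceMatcher(None, expected, spoken).ratio()


def mark_mismatches(pairs, threshold=MATCH_RATE_THRESHOLD):
    """Mark every pair whose match rate is below the threshold and return the marked pairs."""
    for pair in pairs:
        pair.marked = match_rate(pair.text, pair.transcript) < threshold
    return [pair for pair in pairs if pair.marked]


if __name__ == "__main__":
    pairs = [
        SegmentPair("the quick brown fox", "the quick brown fox"),
        SegmentPair("jumps over the lazy dog", "something else entirely"),
    ]
    for pair in mark_mismatches(pairs):
        print("marked:", pair.text)
```

In this sketch a marked pair corresponds to the first target speech segment and the first target text segment. A real system would likely compare at the phoneme or timestamp level rather than on whole words.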