1. An artificial intelligence (AI)-based voice sampling apparatus for providing a speech style in a heterogeneous label, the apparatus comprising:
a rhyme encoder configured to receive a user's voice, extract a voice sample, and analyze a vocal feature included in the voice sample;
a text encoder configured to receive an input of text for reflecting the vocal feature;
a processor configured to:
classify the voice sample input to the rhyme encoder into a label according to the vocal feature,
provide a weight either by measuring a distance between a voice sample corresponding to the label and a voice sample corresponding to a heterogeneous label, the heterogeneous label being a label other than the label, or by measuring a similarity between the label and the heterogeneous label,
extract an embedding vector representing the vocal feature,
generate a speech style from the embedding vector, and
apply the generated speech style to the text; and
a rhyme decoder configured to output synthesized voice data in which the speech style is applied to the text by the processor.
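The processor steps recited above (classification into a label, weighting against heterogeneous labels, and deriving a style embedding) can be illustrated with a minimal sketch. The claim does not specify a distance metric, a weighting function, or how the speech style is generated from the embedding, so everything below is an assumption made for illustration: vocal features are plain lists of floats, each label is represented by a hypothetical centroid, distance is Euclidean, weights are normalized inverse distances, and the style embedding is a simple blend controlled by a hypothetical `mix` parameter. The names `classify`, `heterogeneous_weights`, `style_embedding`, and `EXAMPLE_CENTROIDS` do not come from the claim.

```python
import math

# Hypothetical label centroids in a 2-D vocal-feature space (illustrative values only).
EXAMPLE_CENTROIDS = {
    "calm": [0.0, 0.0],
    "happy": [1.0, 0.0],
    "angry": [0.0, 3.0],
}


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(sample, centroids):
    """Assign the sample to the label whose centroid is nearest (assumed classifier)."""
    return min(centroids, key=lambda label: euclidean(sample, centroids[label]))


def heterogeneous_weights(label, centroids):
    """Weight every other (heterogeneous) label by inverse distance to `label`.

    Closer labels receive larger weights; the weights are normalized to sum to 1.
    """
    raw = {
        other: 1.0 / (1.0 + euclidean(centroids[label], vec))
        for other, vec in centroids.items()
        if other != label
    }
    total = sum(raw.values())
    return {other: w / total for other, w in raw.items()}


def style_embedding(label, centroids, mix=0.5):
    """Blend the label's centroid with the weighted mean of heterogeneous centroids."""
    weights = heterogeneous_weights(label, centroids)
    dim = len(centroids[label])
    hetero = [sum(weights[o] * centroids[o][i] for o in weights) for i in range(dim)]
    return [(1.0 - mix) * centroids[label][i] + mix * hetero[i] for i in range(dim)]


if __name__ == "__main__":
    label = classify([0.1, 0.2], EXAMPLE_CENTROIDS)
    print(label)
    print(heterogeneous_weights(label, EXAMPLE_CENTROIDS))
    print(style_embedding(label, EXAMPLE_CENTROIDS))
```

In this sketch the resulting vector stands in for the embedding from which a speech style would be generated; a real system would pass it, together with the output of the text encoder, to a decoder that synthesizes the voice data.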