CPC G10L 21/00 (2013.01) | 7 Claims |
1. A voice signal conversion model learning device comprising:
a processor; and
a storage medium having computer program instructions stored thereon, wherein the computer program instructions, when executed by the processor, perform processing of:
executing generation processing of generating a conversion destination voice signal on the basis of an input voice signal that is a voice signal of an input voice, conversion source attribute information that is information indicating an attribute of the input voice that is a voice represented by the input voice signal, and conversion destination attribute information indicating an attribute of a voice represented by the conversion destination voice signal that is a voice signal of a conversion destination of the input voice signal; and
executing estimation processing of estimating whether or not a voice signal that is a processing target is a voice signal representing a vocal sound actually uttered by a person on the basis of the conversion source attribute information and the conversion destination attribute information, wherein
the conversion destination voice signal is input to the processing of execution of generation processing,
the processing target is a voice signal input to the processing of execution of generation processing, and
the processing of execution of generation processing and the processing of execution of estimation processing are learned on the basis of an estimation result of the estimation processing.
|
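For illustration only, the arrangement recited in claim 1 can be read as an attribute-conditioned adversarial learning loop. The following is a minimal sketch under stated assumptions and is not the claimed implementation: the function names `generate`, `estimate`, `losses` and `step`, the affine toy models, the binary attribute codes 0.0 and 1.0, the squared-error cycle term, the finite-difference update, and the order in which the attribute pair is passed to the estimator are all hypothetical choices made for this example.

```python
import math

def generate(signal, src_attr, dst_attr, w):
    """Toy generation processing: affine map conditioned on both attribute codes."""
    shift = w["src"] * src_attr + w["dst"] * dst_attr + w["bias"]
    return [w["gain"] * x + shift for x in signal]

def estimate(signal, src_attr, dst_attr, v):
    """Toy estimation processing: probability that `signal` was uttered by a person."""
    mean = sum(signal) / len(signal)
    z = v["sig"] * mean + v["src"] * src_attr + v["dst"] * dst_attr + v["bias"]
    return 1.0 / (1.0 + math.exp(-z))

def losses(batch, w, v):
    """Return (estimator loss, generator loss) for a batch of (signal, attribute) pairs."""
    d_loss = g_loss = 0.0
    for real, src in batch:
        dst = 1.0 - src  # assumption: two attribute classes, convert to the other one
        converted = generate(real, src, dst, w)
        # The conversion destination signal is itself input to generation processing.
        cycled = generate(converted, dst, src, w)
        # Assumption: the second code names the attribute the target should carry.
        p_real = estimate(real, dst, src, v)
        p_fake = estimate(converted, src, dst, v)
        d_loss += -math.log(p_real) - math.log(1.0 - p_fake)
        cycle = sum((a - b) ** 2 for a, b in zip(real, cycled)) / len(real)
        g_loss += -math.log(p_fake) + cycle
    return d_loss, g_loss

def step(params, loss_fn, lr=0.05, h=1e-5):
    """One gradient-descent step using central finite differences (toy scale only)."""
    grads = {}
    for key in params:
        original = params[key]
        params[key] = original + h
        up = loss_fn()
        params[key] = original - h
        down = loss_fn()
        params[key] = original
        grads[key] = (up - down) / (2 * h)
    for key in params:
        params[key] -= lr * grads[key]

# Two synthetic "utterances", one per attribute class (stand-ins for real speech).
batch = [
    ([math.sin(0.3 * t) for t in range(16)], 0.0),
    ([0.5 * math.cos(0.2 * t) for t in range(16)], 1.0),
]
w = {"gain": 1.0, "src": 0.1, "dst": -0.1, "bias": 0.0}  # generator parameters
v = {"sig": 0.1, "src": 0.0, "dst": 0.0, "bias": 0.0}    # estimator parameters

d0, g0 = losses(batch, w, v)
step(v, lambda: losses(batch, w, v)[0])  # estimator learns from its estimation result
d1, _ = losses(batch, w, v)
step(w, lambda: losses(batch, w, v)[1])  # generator learns from the same result
d2, g2 = losses(batch, w, v)
```

In this sketch both the original input signal and the converted signal that is fed back into `generate` serve as processing targets of `estimate`, and both parameter sets are updated from quantities derived from the estimator output. A practical system would presumably replace the affine maps with neural networks operating on acoustic features and use automatic differentiation instead of finite differences.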