CPC G06T 11/00 (2013.01) [G10L 17/02 (2013.01); G10L 17/06 (2013.01)] | 10 Claims |
1. A method for reconstructing a facial image of a speaker from a voice sample of the speaker, the method comprising:
using a voice-face matching model, selecting, from a dataset of facial images, a subset of facial images that were associated by the voice-face matching model with the voice sample of the speaker with the highest matching scores, wherein the voice-face matching model is trained to calculate a matching score indicative of the probability that the voice sample and the facial image belong to the same person, by:
generating a facial latent space vector for each facial image in the dataset of facial images, and a voice latent vector for the voice sample of the speaker; and
calculating the matching score between each of the facial latent space vectors and the voice latent vector; and
reconstructing the facial image of the speaker by unifying the facial images in the subset, to generate a single morphed image.
|
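By way of non-limiting illustration, the steps recited in claim 1 can be sketched as follows. The sketch is not the claimed implementation and rests on several assumptions that do not appear in the claim: the latent vectors are taken as already computed by some trained encoders, the matching score is approximated by cosine similarity rescaled to the range [0, 1], the subset size `k` is a free parameter, and the "single morphed image" is approximated by a plain pixel-wise average of pre-aligned images of equal size. All function names (`matching_scores`, `select_top_k`, `morph`, `reconstruct_face`) are hypothetical.

```python
import numpy as np


def matching_scores(face_vecs, voice_vec):
    """Score each facial latent vector against the voice latent vector.

    Assumption: cosine similarity rescaled to [0, 1] stands in for the
    trained model's probability-like matching score.
    """
    f = face_vecs / np.linalg.norm(face_vecs, axis=1, keepdims=True)
    v = voice_vec / np.linalg.norm(voice_vec)
    return (f @ v + 1.0) / 2.0


def select_top_k(scores, k):
    """Return indices of the k highest scores, best first."""
    return np.argsort(-scores, kind="stable")[:k]


def morph(images):
    """Unify images into one image.

    Assumption: a pixel-wise mean of pre-aligned, equally sized images
    stands in for whatever morphing technique is actually used.
    """
    return np.mean(images, axis=0)


def reconstruct_face(face_vecs, voice_vec, images, k=3):
    """Select the k best-matching faces and unify them into one image."""
    scores = matching_scores(face_vecs, voice_vec)
    top = select_top_k(scores, k)
    return morph(images[top]), top, scores


# Toy data: four 2-D "facial latent vectors" and four constant 2x2 "images".
face_vecs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
voice_vec = np.array([1.0, 0.0])
images = np.stack([np.full((2, 2), val) for val in (0.0, 10.0, 20.0, 30.0)])

morphed, top, scores = reconstruct_face(face_vecs, voice_vec, images, k=2)
```

In this toy example the first and third faces score highest against the voice vector, so the output is the average of those two images. A practical system would replace the averaging step with landmark-based alignment and warping, or with a generative model, before blending.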