US 12,094,034 B1
System and method for face reconstruction from a voice sample
Daniil Ivanov, Tel Aviv (IL); and Arkady Krishtul, Tel Aviv (IL)
Assigned to CORSOUND AI LTD., Tel Aviv (IL)
Filed by Corsound AI Ltd, Tel Aviv (IL)
Filed on Sep. 7, 2023, as Appl. No. 18/462,523.
Int. Cl. G06T 11/60 (2006.01); G06T 11/00 (2006.01); G10L 15/22 (2006.01); G10L 17/02 (2013.01); G10L 17/06 (2013.01)
CPC G06T 11/00 (2013.01) [G10L 17/02 (2013.01); G10L 17/06 (2013.01)]
10 Claims
 
1. A method for reconstructing a facial image of a speaker from a voice sample of the speaker, the method comprising:
using a voice-face matching model, selecting from a dataset of facial images, a subset of facial images that were associated by the voice-face matching model with the voice sample of the speaker with the highest matching scores, wherein the voice-face matching model is trained to calculate a matching score indicative of the probability that the voice sample and the facial image belong to the same person, by:
generating a facial latent space vector for each facial image in the dataset of facial images, and a voice latent vector for the voice sample of the speaker; and
calculating the matching score between each of the facial latent space vectors and the voice latent vector; and
reconstructing the facial image of the speaker by unifying the facial images in the subset, to generate a single morphed image.
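
The steps recited in claim 1 (embed, score, select the highest-scoring subset, unify into one image) can be illustrated with a minimal sketch. This is not the patented implementation. It assumes the latent vectors have already been produced by some trained model, uses cosine similarity as a stand-in for the matching score (the claim describes a score indicative of a probability, which a trained model would supply), and uses plain pixel-wise averaging of equally sized grayscale images as a stand-in for morphing. All function names and the parameter `k` are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]
Image = List[List[float]]  # grayscale pixel grid; all images assumed same size


def cosine_score(a: Vector, b: Vector) -> float:
    """Stand-in matching score: cosine similarity of two latent vectors (assumption)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_top_k(voice_vec: Vector, face_vecs: Sequence[Vector], k: int) -> List[int]:
    """Return indices of the k facial vectors with the highest scores against the voice vector."""
    scores = [cosine_score(voice_vec, f) for f in face_vecs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def morph_by_averaging(images: Sequence[Image]) -> Image:
    """Unify images into one by pixel-wise averaging (a simplification of morphing)."""
    if not images:
        raise ValueError("at least one image is required")
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [
        [sum(img[r][c] for img in images) / n for c in range(cols)]
        for r in range(rows)
    ]


def reconstruct_face(
    voice_vec: Vector,
    face_vecs: Sequence[Vector],
    face_images: Sequence[Image],
    k: int,
) -> Image:
    """Select the top-k matching faces for a voice vector and unify them into one image."""
    top = select_top_k(voice_vec, face_vecs, k)
    return morph_by_averaging([face_images[i] for i in top])


if __name__ == "__main__":
    # Toy data: 2-dimensional latent vectors and 1x2 "images".
    voice = [1.0, 0.0]
    faces = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
    images = [[[0.0, 2.0]], [[10.0, 10.0]], [[2.0, 4.0]]]
    print(reconstruct_face(voice, faces, images, k=2))
```

In practice a morphing step would likely align facial landmarks before blending, and the score would come from the trained voice-face matching model rather than raw cosine similarity; both are simplified here to keep the sketch self-contained.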