US 12,412,559 B2
Voice generation for virtual characters
Zeng Dai, Los Angeles, CA (US); Chen Sun, Los Angeles, CA (US); Ari Shapiro, Los Angeles, CA (US); Kin Chung Wong, Los Angeles, CA (US); Weishan Yu, Culver City, CA (US); and August Yadon, Los Angeles, CA (US)
Assigned to Lemon Inc., Grand Cayman (KY)
Filed by Lemon Inc., Grand Cayman (KY)
Filed on May 23, 2022, as Appl. No. 17/751,324.
Prior Publication US 2023/0377556 A1, Nov. 23, 2023
Int. Cl. G10L 13/02 (2013.01); G06T 13/20 (2011.01); G06T 13/40 (2011.01)

CPC G10L 13/02 (2013.01) [G06T 13/205 (2013.01); G06T 13/40 (2013.01)]

19 Claims

1. A method of generating voices for virtual characters, comprising:

receiving a plurality of source sounds, wherein the plurality of source sounds correspond to a plurality of frames of a video, the video comprising a virtual character;

converting the plurality of source sounds into a plurality of representations in a latent space using a first model, wherein each representation among the plurality of representations comprises a plurality of parameters;

generating a plurality of sounds for the virtual character in the video in real time as the plurality of source sounds are received based on modifying at least one of the plurality of parameters of each representation in the latent space;

driving movements of the virtual character in the video by utilizing landmark coordinates generated based on input images by a second model; and

improving the movements of the virtual character in the video by using the plurality of source sounds as extra input to the second model, wherein the second model is configured to control the movements of the virtual character in the video.
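
The voice-generation steps of claim 1 (receive source sounds per frame, convert each to a latent representation with a first model, modify at least one parameter, and emit a sound as input arrives) can be illustrated with a minimal sketch. This is not the patented implementation. The function name generate_character_voice, the parameters param_index and scale, and the toy_encode and toy_decode stand-ins are all hypothetical and chosen only for illustration; a real first model would be a trained encoder/decoder. The landmark-driven movement steps (the second model) are not covered here.

```python
from typing import Callable, Iterable, Iterator, List, Sequence

Latent = List[float]
Sound = List[float]


def generate_character_voice(
    source_sounds: Iterable[Sequence[float]],
    encode: Callable[[Sequence[float]], Latent],
    decode: Callable[[Latent], Sound],
    param_index: int,
    scale: float,
) -> Iterator[Sound]:
    """Yield one generated sound per source sound, as each one arrives.

    Hypothetical sketch: `encode` stands in for the claimed first model,
    and scaling a single latent parameter stands in for "modifying at
    least one of the plurality of parameters".
    """
    for sound in source_sounds:        # one source sound per video frame
        latent = list(encode(sound))   # representation in the latent space
        latent[param_index] *= scale   # modify one parameter of the representation
        yield decode(latent)           # generated sound for the virtual character


# Toy stand-ins (assumptions, not real models): the latent representation
# is [peak amplitude, number of samples].
def toy_encode(sound: Sequence[float]) -> Latent:
    return [max(abs(s) for s in sound), float(len(sound))]


def toy_decode(latent: Latent) -> Sound:
    # Rebuild a constant signal at the (possibly modified) peak amplitude.
    return [latent[0]] * int(latent[1])


# Usage: two frames of source sound, doubling the first latent parameter.
frames = iter([[0.5, -1.0], [0.25]])
voiced = list(generate_character_voice(frames, toy_encode, toy_decode, 0, 2.0))
# voiced == [[2.0, 2.0], [0.5]]
```

Writing the function as a generator means each output is produced as soon as its source sound is consumed, which loosely mirrors the "in real time as the plurality of source sounds are received" language; actual latency would depend on the models used.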