US 12,002,138 B2
Speech-driven animation method and apparatus based on artificial intelligence
Shiyin Kang, Shenzhen (CN); Deyi Tuo, Shenzhen (CN); Kuongchi Lei, Shenzhen (CN); Tianxiao Fu, Shenzhen (CN); Huirong Huang, Shenzhen (CN); and Dan Su, Shenzhen (CN)
Assigned to TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed by TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED, Shenzhen (CN)
Filed on Oct. 8, 2021, as Appl. No. 17/497,622.
Application 17/497,622 is a continuation of application No. PCT/CN2020/105046, filed on Jul. 28, 2020.
Claims priority of application No. 201910820742.1 (CN), filed on Aug. 29, 2019.
Prior Publication US 2022/0044463 A1, Feb. 10, 2022
Int. Cl. G06T 13/20 (2011.01); G06N 3/04 (2023.01); G06N 3/045 (2023.01); G06T 13/40 (2011.01); G10L 15/04 (2013.01); G10L 15/187 (2013.01); G10L 15/30 (2013.01)
CPC G06T 13/205 (2013.01) [G06N 3/045 (2023.01); G06T 13/40 (2013.01); G10L 15/04 (2013.01); G10L 15/187 (2013.01); G10L 15/30 (2013.01)] 19 Claims
OG exemplary drawing
 
1. A speech-driven animation method, performed by an audio and video processing device, the method comprising:
obtaining a first speech with an acoustic feature, the first speech comprising a plurality of speech frames;
determining linguistics information corresponding to a speech frame in the first speech by applying a neural network mapping model to extract the acoustic feature, the linguistics information being used for identifying a distribution probability that the speech frame in the first speech pertains to phonemes;
determining an expression parameter corresponding to the speech frame in the first speech according to the linguistics information, wherein the expression parameter does not reflect pronunciation habits of different speakers; and
enabling, according to the expression parameter, an animation character to make an expression corresponding to the first speech.
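The claimed pipeline can be sketched as two mappings: acoustic features of each speech frame are mapped to linguistics information (a per-frame distribution over phonemes, often called a phoneme posteriorgram), and that distribution is mapped to speaker-independent expression parameters that drive the animation character. The sketch below is a minimal illustration under assumed dimensions and randomly initialized weights (`W_ling`, `W_expr`, `N_PHONEMES`, `FEAT_DIM`, `EXPR_DIM` are all hypothetical stand-ins, not taken from the patent); a real system would train these networks.

```python
import numpy as np

rng = np.random.default_rng(0)

N_PHONEMES = 40  # hypothetical phoneme inventory size
FEAT_DIM = 13    # hypothetical acoustic feature dimension (e.g. MFCC-like)
EXPR_DIM = 6     # hypothetical number of expression (blendshape) parameters

# Hypothetical weights standing in for the trained neural network mapping model.
W_ling = rng.normal(size=(FEAT_DIM, N_PHONEMES))
W_expr = rng.normal(size=(N_PHONEMES, EXPR_DIM))

def linguistics_info(frames):
    """Map per-frame acoustic features to a distribution over phonemes.

    Each output row sums to 1, identifying the probability that the
    frame pertains to each phoneme (the 'linguistics information')."""
    logits = frames @ W_ling
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

def expression_params(ppg):
    """Map phoneme posteriors to expression parameters in (0, 1).

    Because the input is a phoneme distribution rather than raw audio,
    the output does not reflect individual speakers' pronunciation habits."""
    return 1.0 / (1.0 + np.exp(-(ppg @ W_expr)))  # sigmoid squashing

# A first speech comprising 100 speech frames of acoustic features.
speech = rng.normal(size=(100, FEAT_DIM))
ppg = linguistics_info(speech)      # per-frame phoneme distribution
expr = expression_params(ppg)       # per-frame expression parameters
```

The separation into two stages mirrors the claim's structure: any speech from any speaker is first normalized into a phoneme distribution, so the downstream expression mapping sees only speaker-independent linguistic content.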