| CPC A63F 13/54 (2014.09) [G06F 3/16 (2013.01); G06N 3/08 (2013.01); G10L 13/02 (2013.01); G10L 19/04 (2013.01); G10L 25/30 (2013.01); G10L 25/51 (2013.01); A63F 2300/6081 (2013.01)] | 20 Claims |

|
1. A computer-implemented method of generating context-dependent speech audio in a video game, the method comprising:
enabling, by at least one processor of a computing device, gameplay of the video game;
determining, by a video game engine of the video game on the at least one processor, an in-game event for which context-dependent speech audio is to be generated during the gameplay of the video game, wherein the in-game event includes an action performed by a character of the video game;
obtaining, by the video game engine of the video game, contextual information and speech content data relating to a current state of the gameplay;
requesting, by the video game engine of the video game, the context-dependent speech audio from a speech audio generator of the video game;
generating, by the speech audio generator responsive to the request, the context-dependent speech audio by:
inputting the contextual information relating to the current state of the gameplay into a prosody prediction model, wherein the prosody prediction model comprises a trained machine learning model which is configured to generate predicted prosodic features based on the contextual information;
generating, by the prosody prediction model, predicted prosodic features from the input contextual information;
inputting, into a speech audio generation model, input data comprising:
at least the predicted prosodic features; and
the speech content data relating to the current state of the gameplay;
generating, using one or more encoders of the speech audio generation model, an encoded representation of the speech content data dependent on the predicted prosodic features;
decoding, using a decoder of the speech audio generation model, the encoded representation to generate the context-dependent speech audio; and
causing, by the video game engine of the video game, the context-dependent speech audio that matches the current state of the video game to be played among the gameplay of the in-game event.
|
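The data flow recited in claim 1 (contextual information into a prosody prediction model, predicted prosodic features plus speech content into an encoder, encoded representation into a decoder, audio returned to the game engine) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `Prosody`, `ProsodyPredictor`, `SpeechAudioGenerator`, and `on_in_game_event`, the context features `tension` and `health`, and the fixed arithmetic that stands in for a trained machine learning model. The "decoder" here emits plain sine tones, not speech.

```python
import math
from dataclasses import dataclass


@dataclass
class Prosody:
    pitch_hz: float  # base pitch of the utterance
    energy: float    # amplitude in [0, 1]
    rate: float      # speaking-rate multiplier


class ProsodyPredictor:
    """Hypothetical stand-in for the trained prosody prediction model."""

    def predict(self, context: dict) -> Prosody:
        # Assumed context features, each in [0, 1]; not named in the claim.
        tension = context.get("tension", 0.0)
        health = context.get("health", 1.0)
        return Prosody(
            pitch_hz=120.0 + 80.0 * tension,
            energy=min(1.0, 0.3 + 0.5 * tension + 0.2 * health),
            rate=1.0 + 0.5 * tension,
        )


class SpeechAudioGenerator:
    """Toy encoder/decoder pair conditioned on predicted prosody."""

    SAMPLE_RATE = 8000

    def __init__(self, predictor: ProsodyPredictor):
        self.predictor = predictor

    def encode(self, text: str, prosody: Prosody) -> list:
        # One (frequency, amplitude, duration) frame per character,
        # so the encoding depends on both content and prosody.
        frames = []
        duration = 0.05 / prosody.rate
        for ch in text:
            if ch.isspace():
                frames.append((0.0, 0.0, duration))  # silence between words
            else:
                offset = (ord(ch) % 12) / 12.0  # toy per-character variation
                freq = prosody.pitch_hz * (1.0 + 0.5 * offset)
                frames.append((freq, prosody.energy, duration))
        return frames

    def decode(self, frames: list) -> list:
        # Render each frame as a sine tone; a real decoder would emit speech.
        samples = []
        for freq, amp, dur in frames:
            n = int(self.SAMPLE_RATE * dur)
            samples.extend(
                amp * math.sin(2.0 * math.pi * freq * i / self.SAMPLE_RATE)
                for i in range(n)
            )
        return samples

    def generate(self, context: dict, text: str) -> list:
        prosody = self.predictor.predict(context)
        return self.decode(self.encode(text, prosody))


def on_in_game_event(generator: SpeechAudioGenerator, context: dict, line: str) -> list:
    """Engine-side request; a real engine would pass the result to its audio mixer."""
    return generator.generate(context, line)


generator = SpeechAudioGenerator(ProsodyPredictor())
calm = on_in_game_event(generator, {"tension": 0.0, "health": 1.0}, "Hold on")
tense = on_in_game_event(generator, {"tension": 1.0, "health": 1.0}, "Hold on")
```

In this sketch the same speech content yields different audio for different game states: the high-tension context produces a shorter and louder sample sequence than the calm one, which mirrors the claim's requirement that the output depend on the predicted prosodic features.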