US 12,278,999 B2
Generation of video stream having localized lip-syncing with personalized characteristics
Jun Su, Beijing (CN); Yang Liang, Beijing (CN); Luis Osvaldo Pizana, Austin, TX (US); and Su Liu, Austin, TX (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Jun. 21, 2023, as Appl. No. 18/212,624.
Prior Publication US 2024/0430497 A1, Dec. 26, 2024
Int. Cl. H04N 21/2343 (2011.01); H04N 21/24 (2011.01)

CPC H04N 21/2343 (2013.01) [H04N 21/2402 (2013.01)]

20 Claims

1. A computer-implemented method, comprising:

detecting cultural context and accents of speakers portrayed in a video stream, the speakers speaking a source language, a first of the speakers having an accent when speaking in the source language that is different than an accent of a second of the speakers when the second speaker is speaking in the source language;

selecting accent tags for the speakers according to the detected cultural context and accents of the speakers, wherein the accent tags include data objects identifying the source language and the accent of the respective speaker;

translating a textual representation of spoken words of the speakers from the source language to a target language;

applying the accent tags to the textual representation of the spoken words in the target language according to the speakers corresponding to the textual representation of the spoken words in the target language;

modifying speech lip movements of the speakers portrayed in the video stream to match speech lip movements characteristic of the target language and speech lip movements characteristic of the accents of the speakers according to the applied accent tags; and

outputting a translated video stream having the speakers appearing to speak in the target language with the modified lip movements and to speak in the target language with the respective accents corresponding to the source language according to the applied accent tags.
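
Illustrative sketch (not part of the claim): the tagging and translation steps recited above can be pictured as a small data flow. The Python below is a minimal sketch under stated assumptions, not the patented implementation. The names AccentTag, Segment, select_accent_tags, translate_segments and apply_accent_tags are hypothetical, the translator is passed in as a plain callable, and accent detection, lip-movement modification and video output are left out entirely.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class AccentTag:
    """Hypothetical data object identifying a speaker's source language and accent."""
    source_language: str
    accent: str


@dataclass
class Segment:
    """Hypothetical unit of transcript: one speaker's words in one language."""
    speaker: str
    text: str
    language: str
    accent_tag: Optional[AccentTag] = None


def select_accent_tags(detected_accents: Dict[str, str],
                       source_language: str) -> Dict[str, AccentTag]:
    """Build one accent tag per speaker from already-detected accents."""
    return {
        speaker: AccentTag(source_language, accent)
        for speaker, accent in detected_accents.items()
    }


def translate_segments(segments: List[Segment],
                       target_language: str,
                       translate: Callable[[str, str], str]) -> List[Segment]:
    """Translate each segment's text; the translator is an assumed callable."""
    return [
        Segment(s.speaker, translate(s.text, target_language), target_language)
        for s in segments
    ]


def apply_accent_tags(segments: List[Segment],
                      tags: Dict[str, AccentTag]) -> List[Segment]:
    """Attach to each translated segment the tag of the speaker who said it."""
    return [
        Segment(s.speaker, s.text, s.language, tags[s.speaker])
        for s in segments
    ]
```

In this sketch, two speakers who share a source language but have different accents receive different tags, and each translated segment carries the tag of its own speaker, so a downstream lip-sync stage (not shown) could look up both the target language and the speaker's accent per segment.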