US 12,488,809 B2
Modification of objects in film
Scott Mann, London (GB); Hyeongwoo Kim, London (GB); Sean Danischevsky, London (GB); Rob Hall, London (GB); and Gary Myles Scullion, London (GB)
Assigned to Flawless Holdings Limited, London (GB)
Filed by Flawless Holdings Limited, London (GB)
Filed on Nov. 17, 2023, as Appl. No. 18/513,174.
Application 18/513,174 is a continuation of application No. PCT/GB2022/051338, filed on May 26, 2022.
Application PCT/GB2022/051338 is a continuation of application No. 17/561,346, filed on Dec. 23, 2021, granted, now 11,398,255, issued on Jul. 26, 2022.
Claims priority of provisional application 63/203,354, filed on Jul. 19, 2021.
Claims priority of provisional application 63/193,553, filed on May 26, 2021.
Prior Publication US 2024/0087610 A1, Mar. 14, 2024
Int. Cl. G11B 27/036 (2006.01); G06N 3/08 (2023.01); G06T 3/18 (2024.01); G06T 5/70 (2024.01); G06T 5/77 (2024.01); G06V 10/82 (2022.01); G06V 20/40 (2022.01); G06V 40/16 (2022.01)

CPC G11B 27/036 (2013.01) [G06N 3/08 (2013.01); G06T 3/18 (2024.01); G06T 5/70 (2024.01); G06T 5/77 (2024.01); G06V 10/82 (2022.01); G06V 20/44 (2022.01); G06V 40/161 (2022.01); G06T 2207/10016 (2013.01); G06T 2207/10024 (2013.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/30201 (2013.01)]

17 Claims

1. A system comprising:

one or more processors; and

one or more non-transitory computer-readable media storing instructions which, when executed by the one or more processors, cause the one or more processors to perform operations comprising:

obtaining source video data comprising a plurality of sequences of image frames;

detecting respective instances of an object within at least some sequences of image frames of the plurality of sequences of image frames, the object being a human face; and

for a first instance of the object detected within a first sequence of image frames of the source data:

determining a framewise location and size of the first instance of the object in the first sequence of image frames;

obtaining, using a neural renderer, replacement video data comprising a modified instance of the object; and

replacing, using the determined framewise location and size, at least part of the first instance of the object in the first sequence of image frames with at least part of the modified instance of the object,

wherein obtaining the replacement video data comprises:

processing at least a portion of each image frame of the respective sequence of image frames to generate a three-dimensional synthetic model of the first instance of the object;

modifying the three-dimensional synthetic model; and

generating the replacement video data using the neural renderer and the modified three-dimensional synthetic model,

wherein modifying the three-dimensional synthetic model comprises:

obtaining driving data comprising an audio and/or video recording including speech;

processing the driving data to determine modified parameter values for the three-dimensional synthetic model corresponding to the speech; and

using the modified parameter values to modify the three-dimensional synthetic model, including progressively transitioning between unmodified parameter values for the three-dimensional synthetic model and the modified parameter values for the three-dimensional synthetic model in dependence on when speech is taking place in the driving data.
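
The final limitation of claim 1 (progressively transitioning between unmodified and modified parameter values in dependence on when speech is taking place) can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a per-frame boolean speech-activity signal has already been derived from the driving data, that model parameters are plain lists of floats per frame, and that a linear ramp is an acceptable transition. The names `blend_weights`, `blend_parameters` and `ramp_frames` are hypothetical and do not appear in the patent.

```python
from typing import List, Sequence


def blend_weights(speech_active: Sequence[bool], ramp_frames: int = 5) -> List[float]:
    """Return one weight in [0, 1] per frame.

    The weight rises linearly toward 1 while speech is active and falls
    linearly toward 0 while it is not (assumed ramp shape; hypothetical helper).
    """
    step = 1.0 / max(ramp_frames, 1)
    weights: List[float] = []
    w = 0.0
    for active in speech_active:
        w = min(1.0, w + step) if active else max(0.0, w - step)
        weights.append(w)
    return weights


def blend_parameters(
    unmodified: Sequence[Sequence[float]],
    modified: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[List[float]]:
    """Framewise linear interpolation between unmodified and modified parameters."""
    return [
        [(1.0 - w) * u + w * m for u, m in zip(u_frame, m_frame)]
        for u_frame, m_frame, w in zip(unmodified, modified, weights)
    ]


if __name__ == "__main__":
    # Speech occupies frames 1 to 3 of a six-frame sequence.
    speech = [False, True, True, True, False, False]
    w = blend_weights(speech, ramp_frames=2)
    original = [[0.0, 2.0]] * len(speech)
    driven = [[1.0, 4.0]] * len(speech)
    for frame, params in enumerate(blend_parameters(original, driven, w)):
        print(frame, w[frame], params)
```

In this sketch the ramp starts only once speech is detected. A look-ahead ramp that begins before the first spoken frame would be an equally plausible reading; the claim does not fix the shape or timing of the transition.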