US 12,444,230 B2
System and method for processing media for facial manipulation
Rijul Gupta, Oakland, CA (US)
Assigned to Deep Media, Inc., Oakland, CA (US)
Filed by Deep Media, Inc., Oakland, CA (US)
Filed on Sep. 28, 2022, as Appl. No. 17/954,406.
Claims priority of provisional application 63/250,459, filed on Sep. 30, 2021.
Prior Publication US 2023/0114980 A1, Apr. 13, 2023
Int. Cl. G06V 40/16 (2022.01); G06T 3/4092 (2024.01); G06T 5/50 (2006.01)
CPC G06V 40/167 (2022.01) [G06T 3/4092 (2013.01); G06T 5/50 (2013.01); G06V 40/165 (2022.01); G06T 2207/30201 (2013.01)]
20 Claims
OG exemplary drawing
 
1. A method of processing media for facial manipulation, comprising:
receiving a plurality of input frames;
initiating facial detection on each input frame, wherein the facial detection includes, in response to identifying a face in an input frame, identifying a location of the detected face in the input frame;
bounding the detected faces in the input frames;
cropping the bounded faces in the input frames;
identifying facial landmarks on the detected faces in each cropped input frame;
identifying at least a target face across the cropped input frames and identifying a series of the plurality of input frames that includes the target face;
cropping and adjusting the orientation of the target faces in the series of input frames based on the identified facial landmarks detected in the cropped input frames, so that the target face in each of the series of input frames is in a standard orientation, thereby creating a plurality of aligned target crop frames;
detecting facial landmarks on the target face in each of the aligned target crop frames;
manipulating one or more facial features of the target face in one or more of the aligned target crop frames, thereby creating a plurality of synthetic target faces; and
reverting the orientation of each of the synthetic target faces to an original orientation of each corresponding input frame, wherein the reversion of the orientation of each of the synthetic target faces is based on measured movements of the facial landmarks of the target face between the corresponding input frame and aligned target crop frame.
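
The alignment and reversion steps of claim 1 can be illustrated with a minimal sketch. It is one possible reading of the claim language, not the patented implementation. It assumes a hypothetical 5-point landmark template (`TEMPLATE`) that defines the "standard orientation" and a least-squares similarity transform (scale, rotation, translation) fitted between landmark sets. The function names `estimate_similarity` and `apply_transform` and all coordinates are illustrative. Face detection, landmark detection, and the facial manipulation itself are out of scope here and are replaced by synthetic points.

```python
import numpy as np

# Hypothetical 5-point template (two eyes, nose tip, two mouth corners)
# defining the "standard orientation" inside a 100x100 aligned crop.
TEMPLATE = np.array(
    [[30.0, 40.0], [70.0, 40.0], [50.0, 60.0], [35.0, 80.0], [65.0, 80.0]]
)


def estimate_similarity(src, dst):
    """Least-squares similarity transform mapping src landmarks onto dst.

    Returns a 3x3 homogeneous matrix (uniform scale, rotation, translation).
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - src_mean, dst - dst_mean

    # Cross-covariance between the centered point sets.
    cov = dst_c.T @ src_c / len(src)
    u, s, vt = np.linalg.svd(cov)

    # Guard against a reflection so the result is a proper rotation.
    d = np.ones(2)
    if np.linalg.det(u) * np.linalg.det(vt) < 0:
        d[-1] = -1.0
    rotation = u @ np.diag(d) @ vt

    src_var = (src_c ** 2).sum() / len(src)
    scale = (s * d).sum() / src_var
    translation = dst_mean - scale * rotation @ src_mean

    matrix = np.eye(3)
    matrix[:2, :2] = scale * rotation
    matrix[:2, 2] = translation
    return matrix


def apply_transform(matrix, points):
    """Apply a 3x3 homogeneous transform to an (N, 2) array of points."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (homogeneous @ matrix.T)[:, :2]


# Synthetic landmarks for one input frame: the template rotated by 25 degrees,
# scaled by 1.8, and shifted, standing in for a real landmark detector.
angle = np.deg2rad(25.0)
rot = np.array([[np.cos(angle), -np.sin(angle)], [np.sin(angle), np.cos(angle)]])
frame_landmarks = 1.8 * TEMPLATE @ rot.T + np.array([200.0, 120.0])

# Alignment: bring the target face into the standard orientation.
to_aligned = estimate_similarity(frame_landmarks, TEMPLATE)
aligned_landmarks = apply_transform(to_aligned, frame_landmarks)

# Stand-in for the manipulation step: lower the mouth corners by 3 pixels.
synthetic_landmarks = aligned_landmarks.copy()
synthetic_landmarks[3:, 1] += 3.0

# Reversion: fit the transform from the landmark movement between the
# aligned crop and the input frame, then map the synthetic face back.
to_frame = estimate_similarity(aligned_landmarks, frame_landmarks)
reverted_landmarks = apply_transform(to_frame, synthetic_landmarks)
```

In this sketch the reversion transform is fitted from the landmark correspondences rather than obtained by inverting `to_aligned`, which mirrors the claim's wording that reversion is "based on measured movements of the facial landmarks". In an image pipeline the same 2x3 upper part of each matrix could be passed to a warping routine such as OpenCV's `cv2.warpAffine` to resample pixels. That step is omitted to keep the example dependent on NumPy only.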