US 12,238,451 B2
Predicting video edits from text-based conversations using neural networks
Uttaran Bhattacharya, Sunnyvale, CA (US); Gang Wu, San Jose, CA (US); Viswanathan Swaminathan, Saratoga, CA (US); and Stefano Petrangeli, Mountain View, CA (US)
Assigned to Adobe Inc., San Jose, CA (US)
Filed by Adobe Inc., San Jose, CA (US)
Filed on Nov. 14, 2022, as Appl. No. 18/055,301.
Prior Publication US 2024/0163393 A1, May 16, 2024
Int. Cl. H04N 7/00 (2011.01); G06T 11/60 (2006.01)
CPC H04N 7/002 (2013.01) [G06T 11/60 (2013.01)]
20 Claims
 
1. A computer-implemented method comprising:
receiving an input including a video sequence and text sentences, the text sentences describing a modification to the video sequence;
mapping, by a first neural network, content of the text sentences describing the modification to the video sequence to a candidate video editing operation;
processing, by a second neural network, the video sequence to predict parameter values for the candidate video editing operation; and
generating a modified video sequence by applying the candidate video editing operation with the predicted parameter values to the video sequence.
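
The four steps of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the two neural networks are replaced by simple hand-written stand-ins (keyword matching for the text-to-operation mapping, a mean-brightness heuristic for parameter prediction), and every name here (`EditOperation`, `map_text_to_operation`, `predict_parameters`, `edit_video`, the `brighten` and `trim` operations) is hypothetical and chosen only for illustration. A video is assumed to be a list of grayscale frames with pixel intensities in [0, 1].

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Frame = List[List[float]]   # one grayscale frame: rows of intensities in [0, 1]
Video = List[Frame]         # a video sequence: ordered list of frames
Params = Dict[str, float]


@dataclass
class EditOperation:
    """A candidate video editing operation (hypothetical container)."""
    name: str
    apply: Callable[[Video, Params], Video]


def _brighten(video: Video, params: Params) -> Video:
    # Scale every pixel by the gain and clamp to the valid range.
    gain = params["gain"]
    return [[[min(1.0, p * gain) for p in row] for row in frame] for frame in video]


def _trim(video: Video, params: Params) -> Video:
    # Keep only the frames in [start, end).
    return video[int(params["start"]):int(params["end"])]


OPERATIONS = {
    "brighten": EditOperation("brighten", _brighten),
    "trim": EditOperation("trim", _trim),
}


def map_text_to_operation(sentences: List[str]) -> EditOperation:
    """Stand-in for the first neural network: keyword matching on the text."""
    text = " ".join(sentences).lower()
    if "dark" in text or "bright" in text:
        return OPERATIONS["brighten"]
    if "short" in text or "trim" in text:
        return OPERATIONS["trim"]
    raise ValueError("no candidate editing operation matched the text")


def predict_parameters(video: Video, op: EditOperation) -> Params:
    """Stand-in for the second neural network: heuristics on the video content."""
    if op.name == "brighten":
        pixels = [p for frame in video for row in frame for p in row]
        mean = sum(pixels) / len(pixels) if pixels else 0.0
        # Assumed target: bring mean brightness to 0.5.
        return {"gain": 0.5 / mean if mean > 0 else 1.0}
    if op.name == "trim":
        # Assumed heuristic: keep the first half of the frames.
        return {"start": 0, "end": max(1, len(video) // 2)}
    raise ValueError(f"unknown operation: {op.name}")


def edit_video(video: Video, sentences: List[str]) -> Video:
    """Receive the input, map text to an operation, predict parameters, apply."""
    op = map_text_to_operation(sentences)
    params = predict_parameters(video, op)
    return op.apply(video, params)


if __name__ == "__main__":
    clip = [[[0.25, 0.25]], [[0.25, 0.25]]]  # two 1x2 frames, quite dark
    print(edit_video(clip, ["This clip looks too dark."]))
```

The sketch keeps the operation choice (driven by the text) separate from the parameter values (driven by the video), mirroring the split between the first and second neural networks in the claim. In a real system each stand-in would be replaced by a trained model.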