US 12,462,449 B2
Mask conditioned image transformation based on a text prompt
Ambareesh Revanur, San Jose, CA (US); Debraj Debashish Basu, Sunnyvale, CA (US); Shradha Agrawal, Milpitas, CA (US); Dhwanit Agarwal, San Jose, CA (US); and Deepak Pai, Sunnyvale, CA (US)
Assigned to Adobe Inc., San Jose, CA (US)
Filed by Adobe Inc., San Jose, CA (US)
Filed on May 18, 2023, as Appl. No. 18/319,808.
Prior Publication US 2024/0386627 A1, Nov. 21, 2024
Int. Cl. G06F 40/40 (2020.01); G06T 7/11 (2017.01); G06T 11/00 (2006.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01); G06V 20/70 (2022.01)
CPC G06T 11/001 (2013.01) [G06F 40/40 (2020.01); G06T 7/11 (2017.01); G06V 10/774 (2022.01); G06V 10/82 (2022.01); G06V 20/70 (2022.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)] 20 Claims
 
1. A method, comprising:
receiving, by a processing device, a text prompt and an input image by a generator network, the generator network including a plurality of layers configured to perform respective edits for the text prompt at different resolutions;
generating, by the processing device, a plurality of masks defining local edit regions, respectively, of the input image for respective layers of the plurality of layers, the plurality of masks based on the text prompt;
generating, by the processing device using the generator network, an edited image by editing the input image based on the plurality of masks, the respective edits of the respective layers, and the text prompt; and
outputting, by the processing device, the edited image.
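
The data flow recited in claim 1 can be illustrated with a minimal, non-authoritative sketch. It is not the patented implementation: the claim names no concrete mask predictor or edit operation, so `toy_mask`, `toy_edit`, `resize_nearest`, `edit_image`, the keyword rules, and the `resolutions` parameter are all hypothetical stand-ins chosen only to make the steps runnable. The sketch assumes a square grayscale image stored as nested lists of floats in [0, 1].

```python
from typing import List, Sequence

Image = List[List[float]]  # square grayscale grid, values in [0, 1]


def resize_nearest(grid: Image, size: int) -> Image:
    """Nearest-neighbour resize of a square grid to size x size."""
    n = len(grid)
    return [[grid[r * n // size][c * n // size] for c in range(size)]
            for r in range(size)]


def toy_mask(prompt: str, size: int) -> Image:
    """Hypothetical stand-in for a prompt-based mask generator.

    Selects the top half if the prompt mentions 'sky', else the bottom half.
    """
    top = "sky" in prompt.lower()
    return [[1.0 if (r < size // 2) == top else 0.0 for _ in range(size)]
            for r in range(size)]


def toy_edit(features: Image, prompt: str) -> Image:
    """Hypothetical per-layer edit: brighten if the prompt says 'bright', else darken."""
    delta = 0.5 if "bright" in prompt.lower() else -0.5
    return [[min(1.0, max(0.0, v + delta)) for v in row] for row in features]


def edit_image(image: Image, prompt: str,
               resolutions: Sequence[int] = (2, 4)) -> Image:
    """Apply one masked edit per layer, each at its own resolution."""
    out_size = len(image)
    current = image
    for res in resolutions:  # one "layer" per resolution, coarse to fine
        features = resize_nearest(current, res)
        edited = resize_nearest(toy_edit(features, prompt), out_size)
        mask = resize_nearest(toy_mask(prompt, res), out_size)
        # Blend: the edit applies only inside this layer's local edit region.
        current = [[m * e + (1.0 - m) * c
                    for m, e, c in zip(mask_row, edit_row, cur_row)]
                   for mask_row, edit_row, cur_row in zip(mask, edited, current)]
    return current


if __name__ == "__main__":
    gray = [[0.5] * 4 for _ in range(4)]
    for row in edit_image(gray, "make the sky bright"):
        print(row)
```

In this sketch each layer computes its own mask from the prompt at its own resolution, and pixels outside a layer's mask pass through unchanged, which mirrors the claim's pairing of masks with respective layers. A learned generator would replace both toy functions.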