US 12,254,597 B2
Semantic-aware initial latent code selection for text-guided image editing and generation
Cameron Smith, San Jose, CA (US); Wei-An Lin, San Jose, CA (US); Timothy M. Converse, San Francisco, CA (US); Shabnam Ghadar, San Jose, CA (US); Ratheesh Kalarot, San Jose, CA (US); John Nack, San Jose, CA (US); Jingwan Lu, Santa Clara, CA (US); Hui Qu, Santa Clara, CA (US); Elya Shechtman, Seattle, WA (US); and Baldo Faieta, San Jose, CA (US)
Assigned to Adobe Inc., San Jose, CA (US)
Filed by Adobe Inc., San Jose, CA (US)
Filed on Mar. 30, 2022, as Appl. No. 17/709,221.
Prior Publication US 2023/0316475 A1, Oct. 5, 2023
Int. Cl. G06T 5/50 (2006.01); G06N 3/045 (2023.01)

CPC G06T 5/50 (2013.01) [G06N 3/045 (2023.01); G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01); G06T 2207/20221 (2013.01)]

20 Claims

1. A method, comprising:

receiving, by a request module, an input text and a request for a blended image;

generating, by a contrastive language-image pre-training (“CLIP”) module for the input text, an input text CLIP code;

selecting, by an initial latent code selection module, an initial latent code from among a set of latent codes, the selection based on the initial latent code having a corresponding CLIP code with a greatest semantic similarity to the input text CLIP code;

generating, by a latent code blending module, a blended image latent code by blending the initial latent code with an input image latent code determined for an input image;

generating, by a latent code generator module, the blended image from the blended image latent code; and

transmitting, by the request module responsive to the request, the blended image.
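
The selecting and blending steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not fix a similarity measure or a blending rule, so this sketch assumes cosine similarity as the measure of semantic similarity and linear interpolation as the blend. The names `cosine_similarity`, `select_initial_latent`, `blend_latents`, and the `weight` parameter are hypothetical, and the CLIP codes are assumed to be precomputed vectors rather than outputs of an actual CLIP model.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity, assumed here as the semantic-similarity measure."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def select_initial_latent(
    text_clip_code: Vector,
    candidates: List[Tuple[Vector, Vector]],
) -> int:
    """Return the index of the candidate whose CLIP code is most similar
    to the input text CLIP code.

    Each candidate is a (latent_code, clip_code) pair; the pairing is an
    assumption about how the set of latent codes is stored.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    return max(
        range(len(candidates)),
        key=lambda i: cosine_similarity(candidates[i][1], text_clip_code),
    )


def blend_latents(initial: Vector, input_latent: Vector, weight: float) -> List[float]:
    """Linearly interpolate two latent codes (an assumed blending rule).

    weight is the share given to the initial latent code, in [0, 1].
    """
    if len(initial) != len(input_latent):
        raise ValueError("latent codes must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return [weight * x + (1.0 - weight) * y for x, y in zip(initial, input_latent)]


# Toy data: two candidate latent codes, each paired with a 2-D "CLIP code".
candidates = [
    ([0.0, 0.0], [1.0, 0.0]),
    ([1.0, 1.0], [0.0, 1.0]),
]
text_clip_code = [0.1, 0.9]    # stands in for the input text CLIP code
input_image_latent = [0.0, 0.0]  # stands in for the input image latent code

best = select_initial_latent(text_clip_code, candidates)
blended = blend_latents(candidates[best][0], input_image_latent, weight=0.5)
```

In a full system the blended latent code would then be passed to a generator to produce the blended image; that step is omitted here because the claim does not specify the generator.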