US 12,443,980 B1
Text and image based prompt generation
Sravan Sripada, Sammamish, WA (US); Guanglei Xiong, Pleasanton, CA (US); Yashal Shakti Kanungo, Seattle, WA (US); Tor Hamilton Steiner, Redmond, WA (US); and Renuka Mannem, Austin, TX (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 15, 2024, as Appl. No. 18/607,028.
Int. Cl. G06Q 30/02 (2023.01); G06Q 30/0202 (2023.01); G06Q 30/0241 (2023.01); G06T 11/00 (2006.01)

CPC G06Q 30/0276 (2013.01) [G06Q 30/0202 (2013.01); G06T 11/00 (2013.01)]

20 Claims

1. A computer-implemented method, comprising:

receiving a first image showing an item to be used to generate a second image comprising the item;

generating, using a first machine learning model, a first encoding representing a first textual description of the item based at least in part on the first image;

generating, using a second machine learning model, a second encoding representing a second textual description of the item;

determining, using a third machine learning model to process the second encoding, a first characteristic of the item;

receiving a first input indicating a qualifier describing a second characteristic to be represented in the second image;

generating, using the third machine learning model, a prompt for generating the second image of the item to show the item comprising the first characteristic and a background representing the second characteristic by a fourth machine learning model based at least in part on the first encoding, the first characteristic, and the qualifier; and

generating, using the fourth machine learning model, the second image based at least in part on the prompt, the second image showing the item and a background representing the second characteristic.
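
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `Pipeline`, `ThirdModel`, `extract_characteristic`, `build_prompt`, and `StubThirdModel` are hypothetical, the four models are trivial stand-in callables, and the sketch assumes the second textual description is supplied as plain text, which the claim does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Protocol, Tuple

Encoding = List[float]  # assumption: an encoding is a flat vector of floats


class ThirdModel(Protocol):
    """Hypothetical interface for the model that both extracts a characteristic and writes the prompt."""

    def extract_characteristic(self, encoding: Encoding) -> str: ...

    def build_prompt(self, image_encoding: Encoding, characteristic: str, qualifier: str) -> str: ...


@dataclass
class Pipeline:
    first_model: Callable[[bytes], Encoding]   # image -> encoding of a textual description
    second_model: Callable[[str], Encoding]    # text -> encoding of a second description
    third_model: ThirdModel                    # characteristic extraction and prompt generation
    fourth_model: Callable[[str], bytes]       # prompt -> generated image

    def run(self, first_image: bytes, second_description: str, qualifier: str) -> Tuple[str, bytes]:
        # First encoding, derived from the received image of the item.
        first_encoding = self.first_model(first_image)
        # Second encoding, derived from a second textual description of the item.
        second_encoding = self.second_model(second_description)
        # First characteristic of the item, determined from the second encoding.
        characteristic = self.third_model.extract_characteristic(second_encoding)
        # Prompt built from the first encoding, the characteristic, and the user's qualifier.
        prompt = self.third_model.build_prompt(first_encoding, characteristic, qualifier)
        # Second image generated from the prompt.
        second_image = self.fourth_model(prompt)
        return prompt, second_image


class StubThirdModel:
    """Stand-in only; a real system would use a trained language model here."""

    def extract_characteristic(self, encoding: Encoding) -> str:
        return "red" if encoding and encoding[0] > 0 else "unknown"

    def build_prompt(self, image_encoding: Encoding, characteristic: str, qualifier: str) -> str:
        return f"A {characteristic} item (encoding dim {len(image_encoding)}) on a {qualifier} background"


pipeline = Pipeline(
    first_model=lambda image: [float(len(image))],
    second_model=lambda text: [float(len(text))],
    third_model=StubThirdModel(),
    fourth_model=lambda prompt: prompt.encode("utf-8"),  # placeholder "image" bytes
)
prompt, second_image = pipeline.run(b"\x89PNG", "red ceramic mug", "cozy")
print(prompt)
```

In this sketch the qualifier only shapes the background portion of the prompt, mirroring the claim's separation between the item's own characteristic and the characteristic represented by the background.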