US 12,271,978 B1
Content synthesis using generative Artificial Intelligence model
Rahim Entezari, London (GB); Patrick Esser, London (GB); Robin Rombach, London (GB); and Andreas Blattmann, London (GB)
Assigned to Stability AI Ltd, London (GB)
Filed by Stability AI Ltd, London (GB)
Filed on Sep. 11, 2024, as Appl. No. 18/882,690.
Claims priority of provisional application 63/633,020, filed on Apr. 11, 2024.
Claims priority of provisional application 63/567,127, filed on Mar. 19, 2024.
Int. Cl. G06F 40/00 (2020.01); G06F 40/40 (2020.01); G06T 11/00 (2006.01)
CPC G06T 11/00 (2013.01) [G06F 40/40 (2020.01)]
20 Claims
 
1. A system comprising:
one or more storage media storing instructions; and
one or more processors configured to execute the instructions to cause the system to:
receive a prompt describing a desired characteristic of an image;
generate, using a set of encoding models, a prompt encoding based on the prompt;
generate, using a first transformer block of a diffusion transformer model, a first prompt embedding and a first image embedding based on the prompt encoding and a noise input;
generate, using a second transformer block of the diffusion transformer model, a second image embedding based on the first image embedding and the first prompt embedding; and
generate the image based on the second image embedding.
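
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: every name in it (`toy_encoder`, `encode_prompt`, `joint_attention`, `transformer_block`, `synthesize`, `DIM`) is hypothetical. The sketch assumes two stand-in encoders that emit one pseudo-random vector per word, a single-head attention with identity projections over the concatenated prompt and image tokens, and a trivial decoder that maps each patch vector to one pixel value. None of these choices is stated in the claim.

```python
import math
import random

DIM = 4  # hypothetical embedding width


def toy_encoder(tag):
    """Return a stand-in text encoder: one deterministic vector per word."""
    def enc(prompt):
        out = []
        for word in prompt.split():
            rng = random.Random(f"{tag}:{word}")
            out.append([rng.uniform(-1, 1) for _ in range(DIM)])
        return out
    return enc


def encode_prompt(prompt, encoders):
    """Run each encoder over the prompt and concatenate the token vectors."""
    tokens = []
    for enc in encoders:
        tokens.extend(enc(prompt))
    return tokens


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(tokens):
    """Single-head self-attention with identity projections (an assumption)."""
    scale = math.sqrt(DIM)
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights = softmax(scores)
        mixed = [sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(DIM)]
        out.append([x + y for x, y in zip(q, mixed)])  # residual connection
    return out


def transformer_block(prompt_emb, image_emb):
    """Attend over prompt and image tokens jointly, then split them apart."""
    joined = joint_attention(prompt_emb + image_emb)
    n = len(prompt_emb)
    return joined[:n], joined[n:]


def synthesize(prompt, num_patches=4, seed=0):
    # Set of encoding models -> prompt encoding.
    prompt_encoding = encode_prompt(prompt, [toy_encoder("a"), toy_encoder("b")])
    # Noise input: one Gaussian vector per image patch.
    rng = random.Random(seed)
    noise = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(num_patches)]
    # First block: prompt encoding + noise -> first prompt and image embeddings.
    prompt_1, image_1 = transformer_block(prompt_encoding, noise)
    # Second block: first embeddings -> second image embedding.
    _, image_2 = transformer_block(prompt_1, image_1)
    # Stand-in decoder: squash each patch vector to one pixel value in (0, 1).
    return [1 / (1 + math.exp(-sum(p) / DIM)) for p in image_2]


if __name__ == "__main__":
    print(synthesize("a red fox in snow"))
```

In this sketch the first block updates both token streams, so the second block receives a prompt embedding that has already been conditioned on the image tokens, which mirrors the ordering of the claim elements. A practical diffusion transformer would typically use learned projections and repeat such a pass over many denoising steps; the claim text above does not specify either.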