CPC G06T 11/00 (2013.01) [G06F 40/40 (2020.01); G06T 5/70 (2024.01)] | 20 Claims |
1. A computing system for generating an output image corresponding to an input text, the computing system comprising:
a processor and memory of a computing device, the processor being configured to execute a program using portions of the memory to:
receive the input text from a user;
process an initial image through a first diffusion stage to generate a final first stage image, wherein the first diffusion stage includes processing the initial image, for a first predetermined number of iterations, using a diffusion model, a gradient estimator model smaller than the diffusion model, and a text-image match gradient calculator; and
process the final first stage image through a second diffusion stage to generate a final second stage image, wherein the second diffusion stage includes using the final first stage image as a second stage image to, for a second predetermined number of iterations, perform steps to:
input the second stage image through the diffusion model to generate a second stage processed image;
back-propagate the second stage processed image through the text-image match gradient calculator to calculate a second stage gradient against the input text; and
update the second stage image by applying the second stage gradient to the second stage processed image.
|
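The two-stage loop recited in claim 1 can be illustrated with a minimal sketch. Everything in it is a toy assumption rather than the claimed implementation: images and the text are short lists of floats, and `diffusion_model`, `gradient_estimator`, and `match_gradient` are hypothetical stand-ins for the diffusion model, the smaller gradient estimator model, and the text-image match gradient calculator. The claim does not state how the three components are combined within the first stage, so the combination shown in `first_stage` is assumed for illustration only.

```python
from typing import List

Image = List[float]  # toy image: a flat list of pixel values


def diffusion_model(image: Image) -> Image:
    """Hypothetical stand-in for the diffusion model: damps every pixel."""
    return [0.9 * p for p in image]


def gradient_estimator(image: Image) -> Image:
    """Hypothetical stand-in for the smaller gradient estimator model."""
    return [-0.05 * p for p in image]


def match_gradient(image: Image, text: Image) -> Image:
    """Gradient of the toy match score -0.5 * ||image - text||^2 w.r.t. image."""
    return [t - p for p, t in zip(image, text)]


def first_stage(image: Image, text: Image, iterations: int, step: float = 0.1) -> Image:
    """First diffusion stage (assumed combination of the three components)."""
    for _ in range(iterations):
        processed = diffusion_model(image)
        estimate = gradient_estimator(processed)
        grad = match_gradient(processed, text)
        image = [p + e + step * g for p, e, g in zip(processed, estimate, grad)]
    return image


def second_stage(image: Image, text: Image, iterations: int, step: float = 0.1) -> Image:
    """Second diffusion stage: diffusion model, match gradient, update."""
    for _ in range(iterations):
        # Pass the second stage image through the diffusion model.
        processed = diffusion_model(image)
        # Compute the second stage gradient against the input text.
        grad = match_gradient(processed, text)
        # Apply the gradient to the processed image to get the updated image.
        image = [p + step * g for p, g in zip(processed, grad)]
    return image


def generate(text: Image, initial_image: Image, n_first: int, n_second: int) -> Image:
    """Run both stages; the final first stage image seeds the second stage."""
    final_first_stage_image = first_stage(initial_image, text, n_first)
    return second_stage(final_first_stage_image, text, n_second)
```

In this sketch the second stage omits the gradient estimator, mirroring the claim language in which only the diffusion model and the text-image match gradient calculator are recited for that stage. A call such as `generate([1.0, 0.0], [0.5, -0.5], 3, 2)` runs three first stage iterations followed by two second stage iterations.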