CPC G06T 5/70 (2024.01) [G06T 2207/20081 (2013.01); G06T 2207/20084 (2013.01)] | 20 Claims |
1. A method comprising:
receiving a text prompt;
executing a text encoder on the text prompt to generate an embedding representation;
generating a set of base images based on the embedding representation and parameters of a base image generation model;
executing a high resolution model to upsample one or more base images in the set of base images based on parameters of the high resolution model to generate a set of final images;
ranking the set of base images or the set of final images using reward values that are generated by a reward model, wherein the reward model is trained using human input that provided feedback on a quality of generated images using the base image generation model and the high resolution model; and
outputting one or more final images based on the ranking in response to the text prompt.
|
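The steps recited in claim 1 can be illustrated with a minimal sketch. Everything in it is a hypothetical stand-in and not the patented implementation: `encode_text`, `generate_base_images`, `upsample`, and `toy_reward` are placeholder names, the "models" are simple deterministic functions, and the reward is mean pixel brightness where the claim recites a reward model trained on human feedback. Ranking is applied here to the final images, which is one of the two alternatives the claim allows.

```python
from typing import Callable, List

Image = List[List[float]]  # toy grayscale image: rows of pixel values in [0, 1)


def encode_text(prompt: str, dim: int = 4) -> List[float]:
    """Placeholder text encoder: folds character codes into a fixed-size vector."""
    vec = [0.0] * dim
    for i, ch in enumerate(prompt):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def generate_base_images(embedding: List[float], n: int = 3, size: int = 2) -> List[Image]:
    """Placeholder base model: n low-resolution images derived from the embedding."""
    dim = len(embedding)
    return [
        [[(embedding[(r + c) % dim] + 0.1 * k) % 1.0 for c in range(size)]
         for r in range(size)]
        for k in range(n)
    ]


def upsample(image: Image, factor: int = 2) -> Image:
    """Placeholder high resolution model: nearest-neighbour upsampling."""
    out: Image = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def toy_reward(image: Image) -> float:
    """Placeholder reward: mean brightness (assumption; a real system would
    call a learned reward model trained on human feedback)."""
    pixels = [px for row in image for px in row]
    return sum(pixels) / len(pixels)


def rank_by_reward(images: List[Image], reward_fn: Callable[[Image], float]) -> List[Image]:
    """Order images from highest to lowest reward value."""
    return sorted(images, key=reward_fn, reverse=True)


def text_to_image(prompt: str, num_candidates: int = 3, top_k: int = 1) -> List[Image]:
    """Run the claimed steps in order and return the top_k final images."""
    embedding = encode_text(prompt)                          # text encoder
    base = generate_base_images(embedding, n=num_candidates)  # base images
    final = [upsample(img) for img in base]                  # final images
    ranked = rank_by_reward(final, toy_reward)               # reward ranking
    return ranked[:top_k]                                    # output


if __name__ == "__main__":
    best = text_to_image("a red fox", num_candidates=3, top_k=1)
    print(len(best), "image(s) of size", len(best[0]), "x", len(best[0][0]))
```

Ranking the base images before upsampling, the other alternative in the claim, would mean calling `rank_by_reward` on `base` and upsampling only the selected candidates, which avoids running the high resolution model on images that are later discarded.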