US 12,086,857 B2
Search with machine-learned model-generated queries
Harshit Kharbanda, Pleasanton, CA (US); Arash Sadr, Belmont, CA (US); Alice Au Quan, San Francisco, CA (US); Belinda Luna Zeng, Cupertino, CA (US); Christopher James Kelley, Orinda, CA (US); Jieming Yu, Jersey City, NJ (US); and Minsang Choi, San Francisco, CA (US)
Assigned to GOOGLE LLC, Mountain View, CA (US)
Filed by Google LLC, Mountain View, CA (US)
Filed on Mar. 31, 2023, as Appl. No. 18/193,890.
Application 18/193,890 is a continuation of application No. 18/173,449, filed on Feb. 23, 2023, granted, now U.S. Pat. No. 11,941,678.
Claims priority of provisional application 63/433,559, filed on Dec. 19, 2022.
Prior Publication US 2024/0202795 A1, Jun. 20, 2024
Int. Cl. G06Q 30/00 (2023.01); G06Q 30/0601 (2023.01)
CPC G06Q 30/0627 (2013.01) [G06Q 30/0621 (2013.01); G06Q 30/0643 (2013.01)]
20 Claims
 
1. A system comprising:
one or more processors; and
a memory storing instructions that when executed by the one or more processors cause the system to perform operations comprising:
obtaining a prompt input, wherein the prompt input comprises one or more terms descriptive of an absence of a particular detail, and wherein the prompt input comprises a prompt image descriptive of an object with the particular detail;
processing the prompt input with an image generation model to generate one or more model-generated images, wherein the image generation model comprises a diffusion model, wherein the diffusion model comprises one or more transformer models, wherein the one or more model-generated images are generated based at least in part on the one or more terms, and wherein the one or more model-generated images are descriptive of a generated object without the particular detail, wherein generating the one or more model-generated images with the image generation model comprises:
processing the one or more terms with an embedding model to generate a text embedding; and
processing the text embedding and the prompt image with the diffusion model to generate predicted replacement pixels for a region of the prompt image that comprises at least a portion of the object, wherein the predicted replacement pixels are descriptive of the generated object without the particular detail;
determining one or more search results based on processing the one or more model-generated images with a search engine, wherein the one or more search results are associated with one or more objects without the particular detail; and
providing a search results interface, wherein the search results interface provides the one or more search results for display.
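The data flow recited in claim 1 (prompt input, text embedding, replacement pixels for a region of the prompt image, image-based search, results for display) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the names `PromptInput`, `embed_text`, `inpaint`, `search`, and `run_pipeline` are hypothetical, the hash-based embedding stands in for a learned embedding model, and the iterative pull toward a target value stands in for a diffusion model (it has no learned denoiser and no transformer). Only the order of the operations mirrors the claim.

```python
from dataclasses import dataclass
import hashlib
import random

Image = list[list[float]]  # grayscale pixels in [0, 1]; an assumed toy representation


@dataclass
class PromptInput:
    terms: str    # terms describing the absence of a detail, e.g. "without stripes"
    image: Image  # prompt image showing the object with that detail


def embed_text(terms: str, dim: int = 8) -> list[float]:
    """Toy stand-in for an embedding model: hash each token into a unit vector."""
    vec = [0.0] * dim
    for token in terms.lower().split():
        digest = hashlib.sha256(token.encode()).digest()
        for i in range(dim):
            vec[i] += digest[i] / 255.0
    norm = sum(v * v for v in vec) ** 0.5 or 1.0
    return [v / norm for v in vec]


def inpaint(image: Image, region: tuple[int, int, int, int],
            text_embedding: list[float], steps: int = 10, seed: int = 0) -> Image:
    """Toy stand-in for the diffusion model (NOT a real diffusion process).

    Fills `region` (top, left, bottom, right; end-exclusive) with noise, then
    repeatedly moves those pixels halfway toward a value derived from the text
    embedding. Pixels outside the region are copied unchanged.
    """
    rng = random.Random(seed)
    top, left, bottom, right = region
    target = sum(text_embedding) / len(text_embedding)  # placeholder conditioning
    out = [row[:] for row in image]  # do not mutate the prompt image
    for r in range(top, bottom):
        for c in range(left, right):
            out[r][c] = rng.random()
    for _ in range(steps):
        for r in range(top, bottom):
            for c in range(left, right):
                out[r][c] += 0.5 * (target - out[r][c])
    return out


def search(query_image: Image, catalog: dict[str, Image], top_k: int = 2) -> list[str]:
    """Toy stand-in for a search engine: rank catalog images by squared pixel distance."""
    def distance(a: Image, b: Image) -> float:
        return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    ranked = sorted(catalog, key=lambda name: distance(query_image, catalog[name]))
    return ranked[:top_k]


def run_pipeline(prompt: PromptInput, region: tuple[int, int, int, int],
                 catalog: dict[str, Image]) -> list[str]:
    """Embed the terms, generate replacement pixels, then search with the result."""
    embedding = embed_text(prompt.terms)
    generated = inpaint(prompt.image, region, embedding)
    return search(generated, catalog)
```

In a real system each stub would presumably be replaced by a trained component; the sketch only shows that the search step consumes the model-generated image rather than the original prompt image or the raw terms.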