| CPC G06F 16/532 (2019.01) [G06F 16/243 (2019.01)] | 20 Claims |

|
1. A computer-implemented method for providing text responses to image-based queries:
based on receiving an input image and a natural language query corresponding to the input image, obtaining reverse image search grounding information for the input image;
providing a comprehensive image prompt and the input image to a visual-based large generative model to generate visual image grounding information;
generating a text response to the natural language query corresponding to the input image using a large generative language model based at least in part on the reverse image search grounding information and the visual image grounding information; and
providing the text response in response to the natural language query.
|
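
The four steps of claim 1 can be read as a small pipeline. The sketch below is illustrative only and is not taken from the patent: `ImageQuery`, `build_llm_prompt`, `answer_image_query`, the three callable types, and the wording of `IMAGE_PROMPT` are all hypothetical names chosen for this example. The reverse image search, the visual-based generative model, and the language model are passed in as plain callables, since the claim does not name any particular service or API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical callable signatures; the claim does not specify any API.
ReverseImageSearch = Callable[[bytes], List[str]]   # image -> search hits
VisionModel = Callable[[str, bytes], str]           # (prompt, image) -> description
LanguageModel = Callable[[str], str]                # prompt -> text response

# Assumed wording for the "comprehensive image prompt" of the claim.
IMAGE_PROMPT = "Describe everything visible in this image in detail."


@dataclass
class ImageQuery:
    """An input image together with the natural language query about it."""
    image: bytes
    question: str


def build_llm_prompt(question: str,
                     search_grounding: List[str],
                     visual_grounding: str) -> str:
    """Combine both kinds of grounding information with the user's question."""
    lines = ["Reverse image search results:"]
    lines += [f"- {hit}" for hit in search_grounding]
    lines += ["Visual description:", visual_grounding, "Question:", question]
    return "\n".join(lines)


def answer_image_query(query: ImageQuery,
                       reverse_image_search: ReverseImageSearch,
                       vision_model: VisionModel,
                       language_model: LanguageModel) -> str:
    # Step 1: obtain reverse image search grounding information.
    search_grounding = reverse_image_search(query.image)
    # Step 2: give the image prompt and the image to the visual model.
    visual_grounding = vision_model(IMAGE_PROMPT, query.image)
    # Step 3: generate the text response from both grounding sources.
    prompt = build_llm_prompt(query.question, search_grounding, visual_grounding)
    # Step 4: provide the text response to the caller.
    return language_model(prompt)
```

Passing the three models as arguments keeps the sketch independent of any vendor and makes each step easy to replace with a stub when testing. In this sketch the language model sees only text, so the image reaches it solely through the two grounding sources; the claim itself does not say whether the language model also receives the image.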