| CPC G06F 16/9532 (2019.01) [G06F 3/0482 (2013.01); G06F 3/0488 (2013.01); G06F 16/9535 (2019.01); G06F 16/9577 (2019.01)] | 20 Claims |

|
1. A computing system for multimodal search, the system comprising:
one or more processors; and
one or more non-transitory computer-readable media that collectively store instructions that, when executed by the one or more processors, cause the computing system to perform operations, the operations comprising:
obtaining image data, wherein the image data is descriptive of one or more images, wherein the one or more images comprise one or more frames obtained from a live camera feed;
processing the image data with an object classification model to determine one or more object classifications for one or more objects depicted in the one or more images;
processing the one or more object classifications to generate one or more multimodal query suggestions, wherein the one or more multimodal query suggestions comprise one or more suggested text strings to provide with at least a portion of the image data to a search engine;
providing the one or more suggested text strings for display with the live camera feed;
obtaining a selection of the one or more suggested text strings associated with the one or more multimodal query suggestions;
generating a multimodal query comprising the one or more suggested text strings and at least one of the one or more images or a current frame of the live camera feed; and
determining one or more search results based on the multimodal query.
|
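The operations recited in claim 1 can be read as a five-stage pipeline: classify objects in a camera frame, turn the classifications into suggested text strings, accept a user selection, combine the selected text with a frame into a multimodal query, and determine search results. The sketch below is illustrative only and is not taken from the patent. Every name in it (`Frame`, `MultimodalQuery`, `SUGGESTION_TEMPLATES`, `stub_classifier`, `stub_search_engine`) is a hypothetical stand-in, and the template-based suggestion step is an assumption; the claim does not say how suggestions are derived from classifications.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for one frame obtained from a live camera feed."""
    pixels: bytes


@dataclass(frozen=True)
class MultimodalQuery:
    """A query that pairs a suggested text string with image data."""
    text: str
    image: Frame


# Assumed templates; the claim does not specify how suggestions are produced.
SUGGESTION_TEMPLATES = ("buy {label}", "{label} near me", "how to care for {label}")


def classify_objects(frame: Frame, classifier: Callable[[Frame], Sequence[str]]) -> List[str]:
    """Run an object classification model over the frame and return its labels."""
    return list(classifier(frame))


def suggest_text_strings(labels: Sequence[str]) -> List[str]:
    """Expand each object classification into candidate text strings."""
    return [t.format(label=label) for label in labels for t in SUGGESTION_TEMPLATES]


def build_query(selected_text: str, frame: Frame) -> MultimodalQuery:
    """Combine the selected suggestion with a frame from the feed."""
    return MultimodalQuery(text=selected_text, image=frame)


def run_search(query: MultimodalQuery,
               search_engine: Callable[[MultimodalQuery], List[str]]) -> List[str]:
    """Determine search results for the multimodal query."""
    return search_engine(query)


def stub_classifier(frame: Frame) -> List[str]:
    """Placeholder model: always reports one label for a non-empty frame."""
    return ["monstera plant"] if frame.pixels else []


def stub_search_engine(query: MultimodalQuery) -> List[str]:
    """Placeholder search engine: echoes the query instead of searching."""
    return [f"result for '{query.text}' ({len(query.image.pixels)} image bytes)"]


frame = Frame(pixels=b"\x00\x01\x02")                # obtain image data
labels = classify_objects(frame, stub_classifier)   # object classifications
suggestions = suggest_text_strings(labels)          # suggested text strings
selected = suggestions[0]                           # simulated user selection
query = build_query(selected, frame)                # multimodal query
results = run_search(query, stub_search_engine)     # search results
print(results)
```

In a real system the classifier and search engine would be a trained model and a retrieval backend. Passing them in as callables keeps the sketch self-contained and makes the order of the claimed operations easy to follow.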