US 12,223,699 B2
Multimodal machine learning image and text combined search method
Lei Xu, Shanghai (CN); and Deng Feng Wan, Shanghai (CN)
Assigned to SAP SE, Walldorf (DE)
Filed by SAP SE, Walldorf (DE)
Filed on May 10, 2022, as Appl. No. 17/740,479.
Prior Publication US 2023/0368509 A1, Nov. 16, 2023
Int. Cl. G06V 10/80 (2022.01); G06F 40/20 (2020.01); G06N 20/20 (2019.01); G06V 10/74 (2022.01); G06V 10/82 (2022.01)
CPC G06V 10/806 (2022.01) [G06F 40/20 (2020.01); G06N 20/20 (2019.01); G06V 10/761 (2022.01); G06V 10/82 (2022.01)]
17 Claims
 
1. A computer-implemented method comprising:
receiving from a repository a plurality of items, wherein each item has an image and a textual description associated therewith;
generating a first image feature vector, for a first item, by processing a respective image using a first machine learning model;
generating a first textual feature vector, for the first item, by processing a respective textual description using a second machine learning model;
combining, for the first item, the first image feature vector for the first item and the first textual feature vector for the first item, to generate a first combined feature vector for the first item;
generating, for the first item, a first similarity list of similar items, wherein a first similar item is included in the first similarity list based on a similarity between the first image feature vector for the first item and a similar image feature vector for the similar item;
generating, for the first item, a second similarity list of similar items, wherein a second similar item is included in the second similarity list based on a similarity between the first text feature vector for the first item and a similar text feature vector for the second similar item;
generating, for the first item, a third similarity list of similar items, wherein a third similar item is included in the third similarity list based on a similarity between the first combined feature vector for the first item and a similar combined feature vector for the third similar item;
combining the first similarity list for the first item, the second similarity list for the first item, and the third similarity list for the first item to generate a combined similarity list for the first item by:
identifying a first set of similar items for the first item that are in only one of the first similarity list for the first item, the second similarity list for the first item, or the third similarity list for the first item, and
including the first set of similar items and respective similarity values for the first set of similar items in the combined similarity list of similar items for the first item;
receiving a request for information for the first item;
retrieving the combined similarity list of similar items for the first item; and
providing the combined similarity list of similar items for the first item in response to the request.
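
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only shows one way the recited data flow could fit together. Several details are assumptions that the claim does not specify: the hand-written toy vectors stand in for the outputs of the first and second machine learning models, "combining" the image and textual feature vectors is done by concatenation, similarity is measured by cosine similarity, each similarity list keeps the top_k nearest items, and an item that appears in more than one list receives the mean of its values. All function and variable names (cosine, similarity_list, combine_lists, build_combined_list, store, handle_request) are hypothetical.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_list(item_id, vectors, top_k):
    """Return {other_id: similarity} for the top_k items most similar to item_id."""
    scores = {
        other: cosine(vectors[item_id], vec)
        for other, vec in vectors.items()
        if other != item_id
    }
    best = sorted(scores, key=scores.get, reverse=True)[:top_k]
    return {other: scores[other] for other in best}


def combine_lists(lists):
    """Merge several similarity lists into one combined similarity list."""
    counts = Counter(other for lst in lists for other in lst)
    combined = {}
    for lst in lists:
        for other, value in lst.items():
            if counts[other] == 1:
                # Item is in only one list: include it with its similarity value.
                combined[other] = value
            else:
                # Assumption (not recited in claim 1): average over the lists it is in.
                combined[other] = combined.get(other, 0.0) + value / counts[other]
    return combined


# Toy stand-ins for per-item model outputs (assumed values, not real embeddings).
image_vecs = {"A": [1.0, 0.0], "B": [0.9, 0.1], "C": [0.0, 1.0], "D": [0.5, 0.5]}
text_vecs = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.9, 0.1], "D": [0.5, 0.5]}

# Assumption: the combined feature vector is the concatenation of the two vectors.
combined_vecs = {item: image_vecs[item] + text_vecs[item] for item in image_vecs}


def build_combined_list(item_id, top_k=1):
    """Build the image, text, and combined similarity lists, then merge them."""
    lists = [
        similarity_list(item_id, vecs, top_k)
        for vecs in (image_vecs, text_vecs, combined_vecs)
    ]
    return combine_lists(lists)


# Precompute the combined similarity list for every item in the repository.
store = {item_id: build_combined_list(item_id) for item_id in image_vecs}


def handle_request(item_id):
    """Retrieve the precomputed combined similarity list for the requested item."""
    return store[item_id]
```

With these toy vectors, handle_request("A") returns items B, C, and D: B is found only by image similarity, C only by text similarity, and D only by combined-vector similarity, so each falls into the "only one of the lists" case recited in the claim.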