CPC G06F 16/5866 (2019.01) [G06F 16/53 (2019.01)] | 16 Claims |
1. A method for combining image data, one or more 3D shapes, and tag data into a unified representation, comprising the steps of:
determining a vector representation for the image data, the vector representation for the image data found within a fixed vector space of words;
determining a vector representation for the one or more 3D shapes, the vector representation for the one or more 3D shapes found within the vector space of words, by:
computing rendered views for each of the one or more 3D shapes from multiple viewpoints,
computing a descriptor for each view, and
averaging the descriptors for each view by computing a weighted average from the rendered views, the weight of each rendered view determined by a proportion of information of the 3D shape captured by the rendered view;
determining a vector representation for the tag data, the vector representation for the tag data found within the vector space of words; and
combining the vector representations into a unified representation by:
embedding the vector representation for the image data, the one or more 3D shapes, and the vector representation for the tag data into the vector space of words, and
determining linear combinations of the vector representations within the vector space of words to compute the unified representation, wherein image data and tag data are weighted to specify their respective influence.
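For illustration only, the arithmetic recited in the claim can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the patented implementation. The function names (`weighted_average`, `shape_vector`, `unified_representation`), the `view_information` scores, and the default shape weight of 1.0 are all hypothetical. The claim does not specify how the per-view descriptors or the proportion of information captured by each view are obtained, so both are taken here as given inputs.

```python
from typing import List, Sequence


def weighted_average(vectors: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Return sum_i(w_i * v_i) / sum_i(w_i) for equal-length vectors."""
    if not vectors or len(vectors) != len(weights):
        raise ValueError("need at least one vector and one weight per vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    return [sum(w * v[k] for v, w in zip(vectors, weights)) / total
            for k in range(dim)]


def shape_vector(view_descriptors: Sequence[Sequence[float]],
                 view_information: Sequence[float]) -> List[float]:
    """Combine per-view descriptors of one 3D shape into a single vector.

    Assumption: view_information[i] is a score for the proportion of the
    shape's information captured by rendered view i; how that score is
    measured is not given in the claim.
    """
    return weighted_average(view_descriptors, view_information)


def unified_representation(image_vec: Sequence[float],
                           shape_vec: Sequence[float],
                           tag_vec: Sequence[float],
                           image_weight: float,
                           tag_weight: float,
                           shape_weight: float = 1.0) -> List[float]:
    """Linear combination of three vectors that share one vector space.

    image_weight and tag_weight set the influence of the image data and
    the tag data; shape_weight defaulting to 1.0 is an assumption.
    """
    if not (len(image_vec) == len(shape_vec) == len(tag_vec)):
        raise ValueError("all vectors must have the same dimension")
    return [image_weight * i + shape_weight * s + tag_weight * t
            for i, s, t in zip(image_vec, shape_vec, tag_vec)]


if __name__ == "__main__":
    # Two rendered views; the first is assumed to capture three times
    # as much of the shape as the second.
    shape = shape_vector([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
    print(shape)  # [0.75, 0.25]
    unified = unified_representation([1.0, 0.0], shape, [0.0, 1.0],
                                     image_weight=0.5, tag_weight=0.25)
    print(unified)  # [1.25, 0.5]
```

In this sketch the view weights are normalized by their sum, so only their relative sizes matter, while the modality weights in the final combination are applied as given and are not normalized.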