| CPC G06Q 10/067 (2013.01) [G06F 16/35 (2019.01); G06F 40/30 (2020.01)] | 17 Claims |

|
1. A method implemented on at least one machine including at least one processor, memory, and communication platform capable of connecting to a network for modeling an entity based on textual information, the method comprising:
obtaining information about an entity to be modeled;
searching textual information related to the entity, including a first plurality of documents related to at least one aspect of the entity;
adding, to each of some of the first plurality of documents, new content to generate augmented documents;
obtaining first aggregated semantic models for a second plurality of documents including the first plurality of documents and the augmented documents, wherein each of the first aggregated semantic models represents one of the second plurality of documents and includes a semantic feature vector and a semantic signature, wherein the semantic signature is generated by an autoencoder via a dimensionality reduction process to reduce a dimensionality of the semantic feature vector, and wherein the autoencoder is previously derived through unsupervised deep learning;
identifying, via clustering based on the first aggregated semantic models, one or more groups of the first aggregated semantic models, wherein each of the one or more groups includes a set of first aggregated semantic models representing semantics of the second plurality of documents;
obtaining each of one or more second aggregated semantic models for each of one or more first pseudo documents by combining each of the one or more groups of the first aggregated semantic models;
deriving, via machine learning, a third aggregated semantic model of a second pseudo document by combining the one or more second aggregated semantic models or by combining at least one of the one or more second aggregated semantic models and at least one of the first aggregated semantic models, to represent each of the at least one aspect of the entity, yielding one or more third aggregated semantic models, wherein each of the third aggregated semantic models for an aspect of the entity comprises an aggregated semantic feature vector and an aggregated semantic signature for characterizing the aspect of the entity; and
identifying, based on similarity of aggregated semantic signatures among the entity and other entities, one or more of the other entities, wherein each of the one or more other entities has at least one aspect that is characterized by the corresponding aggregated semantic signature and is similar to any of the at least one aspect of the entity.
|
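
The data flow recited in claim 1 (feature vector, reduced signature, clustering, combination into pseudo-document models, signature similarity) can be illustrated with a minimal sketch. Everything named here is a hypothetical stand-in, not the claimed implementation: `encode` uses a fixed linear map with `tanh` in place of an autoencoder derived through unsupervised deep learning, `kmeans` is one possible clustering choice, `combine` uses element-wise averaging in place of the machine-learned combination, and cosine similarity with a 0.9 threshold is an assumed similarity measure.

```python
import math

def encode(vec, weights):
    # Assumption: a fixed linear map plus tanh stands in for the encoder
    # half of a trained autoencoder (dimensionality reduction step).
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]

def mean(vectors):
    # Element-wise average of equally sized vectors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def kmeans(points, k, iters=10):
    # Plain k-means seeded with the first k points; returns cluster labels.
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels

def build_model(vec, weights):
    # "First aggregated semantic model": feature vector plus signature.
    return {"vector": vec, "signature": encode(vec, weights)}

def combine(models):
    # Assumption: averaging stands in for the learned combination step.
    return {"vector": mean([m["vector"] for m in models]),
            "signature": mean([m["signature"] for m in models])}

def model_aspect(doc_vectors, weights, k):
    first = [build_model(v, weights) for v in doc_vectors]
    labels = kmeans([m["signature"] for m in first], k)
    groups = [[m for m, lab in zip(first, labels) if lab == c] for c in range(k)]
    second = [combine(g) for g in groups if g]   # one model per first pseudo document
    return combine(second)                       # model of the second pseudo document

def similar_entities(target, others, threshold=0.9):
    # Compare aggregated semantic signatures across entities.
    return [name for name, model in others.items()
            if cosine(target["signature"], model["signature"]) >= threshold]

# Hypothetical toy data: 4-dimensional feature vectors reduced to 2 dimensions.
WEIGHTS = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
entity_a = model_aspect([[1, 0, 0, 0], [0.9, 0.1, 0, 0],
                         [0, 0, 1, 0], [0, 0.1, 0.9, 0]], WEIGHTS, k=2)
entity_b = model_aspect([[0.95, 0, 0, 0], [0, 0, 0.95, 0]], WEIGHTS, k=2)
entity_c = model_aspect([[1, 0, 0, 0], [0.8, 0, 0, 0]], WEIGHTS, k=1)

print(similar_entities(entity_a, {"B": entity_b, "C": entity_c}))  # ['B']
```

In this toy run, entity B covers the same two topics as entity A and so has a near-identical aggregated signature, while entity C covers only one topic and falls below the assumed threshold.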