CPC G06F 9/541 (2013.01) [G06F 16/221 (2019.01); G06F 16/2456 (2019.01); G06F 16/252 (2019.01); G06F 16/27 (2019.01); G06F 16/84 (2019.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01)] | 20 Claims |
1. A system, comprising:
a processor;
a non-transitory computer-readable medium; and
stored instructions translatable by the processor for implementing a text mining engine within a first subsystem, the text mining engine configured for ingesting, through a content ingestion pipeline, disparate contents obtained or received by disparate crawlers through real-time data feeds from disparate content sources, the ingesting the disparate contents comprising:
inferring semantic metadata from the disparate contents;
dynamically mapping the semantic metadata to an internal ingestion pipeline document, the internal ingestion pipeline document conforming to a uniform mapping schema that defines a set of master metadata of interest that can be captured in the internal ingestion pipeline document, wherein the dynamically mapping captures the semantic metadata or a portion thereof in the internal ingestion pipeline document in accordance with the uniform mapping schema; and
mapping the semantic metadata or the portion thereof captured in the internal ingestion pipeline document to metadata tables in a central repository to thereby persist the semantic metadata or the portion thereof in the central repository, the metadata tables conforming to a single common data model of the central repository, the metadata tables accessible by a second subsystem configured for providing a Web service to a client device.
|
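The three ingestion steps recited in claim 1 (inferring semantic metadata, capturing it in an internal ingestion pipeline document under a uniform mapping schema, and persisting it to metadata tables in a central repository) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the claim: the field names in `MASTER_FIELDS`, the keyword-based `infer_semantic_metadata` stand-in for the text mining engine, the `document_metadata` table layout, and the use of an in-memory SQLite database as the central repository. It is not the patented implementation.

```python
import sqlite3

# Hypothetical uniform mapping schema: the master metadata of interest.
MASTER_FIELDS = ("source", "language", "topic", "sentiment")


def infer_semantic_metadata(content: dict) -> dict:
    """Toy stand-in for the text mining engine (a real one would use NLP models)."""
    text = content["text"]
    lowered = text.lower()
    return {
        "source": content["source"],
        "language": "en",  # assumed constant; no real language detection here
        "topic": "finance" if "market" in lowered else "general",
        "sentiment": "positive" if "good" in lowered else "neutral",
        "word_count": len(text.split()),  # not in the schema, so not captured
    }


def to_pipeline_document(doc_id: str, metadata: dict) -> dict:
    """Capture only the portion of the metadata that the schema defines."""
    captured = {name: metadata[name] for name in MASTER_FIELDS if name in metadata}
    return {"id": doc_id, "metadata": captured}


def persist(conn: sqlite3.Connection, doc: dict) -> None:
    """Map the pipeline document to a metadata table in the repository."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS document_metadata ("
        "doc_id TEXT, name TEXT, value TEXT, PRIMARY KEY (doc_id, name))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO document_metadata VALUES (?, ?, ?)",
        [(doc["id"], name, str(value)) for name, value in doc["metadata"].items()],
    )
    conn.commit()


def ingest(conn: sqlite3.Connection, doc_id: str, content: dict) -> dict:
    """Run the three steps in order for one piece of crawled content."""
    doc = to_pipeline_document(doc_id, infer_semantic_metadata(content))
    persist(conn, doc)
    return doc


# Example run against an in-memory database standing in for the repository.
conn = sqlite3.connect(":memory:")
ingest(conn, "doc-1", {"source": "rss", "text": "Good news for the market today"})
rows = conn.execute(
    "SELECT name, value FROM document_metadata WHERE doc_id = ? ORDER BY name",
    ("doc-1",),
).fetchall()
print(rows)
```

In this sketch, content from any crawler passes through the same `to_pipeline_document` step, so every source ends up in one table shape. That mirrors the claim's single common data model, which a second subsystem (for example, a Web service) could query without knowing where the content originated.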