| CPC G06N 5/022 (2013.01) [G06F 16/9024 (2019.01); G06F 16/90332 (2019.01); G06F 16/93 (2019.01); G06F 40/237 (2020.01); G06F 40/30 (2020.01)] | 20 Claims |

|
1. A computer program product comprising computer executable code embodied in a non-transitory computer readable medium that, when executing on one or more computing devices, performs the steps of:
causing a search of a corpus of documents to obtain a search result including a plurality of documents each mentioning at least one of two entities;
automatically identifying a subset of the plurality of documents wherein each document in the subset contains a co-mention of both of the two entities, wherein the co-mention of both of the two entities comprises mentions of both of the two entities within a same sentence or in adjacent sentences or other surrounding text, in accordance with a rule or a metric used to identify the co-mention of the two entities;
in response to determining that a number of co-mentions of both of the two entities in the subset is above a predetermined threshold, identifying and substantiating a relationship between the two entities based on text in the documents belonging to the subset;
in response to determining that the number of co-mentions of both of the two entities in the subset is below the predetermined threshold, initiating an unsupervised multi-hop search within the plurality of documents for identifying and substantiating the relationship between the two entities through an intermediate third entity forming a chain of co-mentions between the first entity and the second entity, by:
ranking one or more additional entities co-mentioned in the plurality of documents with a first one of the two entities based on graph centrality over normalized entity identifiers;
iteratively evaluating the one or more additional entities with a beam search algorithm using an A* graph search to calculate a distance from the first one of the two entities to a second one of the two entities; and
determining an entity from the one or more additional entities that provides a smallest value for the distance from the first one of the two entities to the second one of the two entities and selecting this entity as the third entity; and
in response to locating the relationship between the two entities through the third entity, providing an identifier for the third entity and one or more segments of text substantiating a supporting co-mention of the third entity and the first one of the two entities.
|
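The flow recited in claim 1 can be illustrated with a minimal sketch, which is not the patented implementation. Several things here are assumptions made for illustration only: entities are lowercase strings matched by substring, co-mentions are counted within a one-sentence window, edge cost is the inverse of the co-mention count, "graph centrality" is plain degree centrality, the beam is a single top-k pruning step, and the A* search uses a zero heuristic (which makes it equivalent to uniform-cost search). The claim leaves the case of a count equal to the threshold unspecified, so the sketch falls through to the multi-hop branch in that case. The function names (`co_mentions`, `build_graph`, `shortest_distance`, `relate`) and the toy corpus are hypothetical.

```python
import heapq
import re
from collections import defaultdict
from itertools import combinations

def sentences(doc):
    # Naive splitter (assumption); a real system would use an NLP tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]

def co_mentions(doc, a, b, window=1):
    """Text segments where a and b occur in the same or adjacent sentences."""
    sents = sentences(doc)
    low = [s.lower() for s in sents]
    spans = set()
    for i, s in enumerate(low):
        if a in s:
            for j in range(max(0, i - window), min(len(low), i + window + 1)):
                if b in low[j]:
                    spans.add((min(i, j), max(i, j)))
    return [" ".join(sents[lo:hi + 1]) for lo, hi in sorted(spans)]

def build_graph(docs, entities, window=1):
    """Undirected co-mention graph; more co-mentions means a cheaper edge."""
    graph = defaultdict(dict)
    for x, y in combinations(entities, 2):
        n = sum(len(co_mentions(d, x, y, window)) for d in docs)
        if n:
            graph[x][y] = graph[y][x] = 1.0 / n
    return graph

def shortest_distance(graph, start, goal, blocked=(), heuristic=lambda n: 0.0):
    """A* search; with the default zero heuristic it is uniform-cost search."""
    frontier = [(heuristic(start), 0.0, start)]
    best = {start: 0.0}
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        if node == goal:
            return cost
        for nxt, w in graph.get(node, {}).items():
            if nxt in blocked:
                continue
            new = cost + w
            if new < best.get(nxt, float("inf")):
                best[nxt] = new
                heapq.heappush(frontier, (new + heuristic(nxt), new, nxt))
    return float("inf")

def relate(docs, a, b, entities, threshold=2, beam_width=3):
    # Direct branch: enough co-mentions of both entities to substantiate a link.
    direct = [seg for d in docs for seg in co_mentions(d, a, b)]
    if len(direct) > threshold:
        return {"via": None, "evidence": direct}
    # Multi-hop branch: rank entities co-mentioned with `a` by degree centrality.
    graph = build_graph(docs, entities)
    candidates = [e for e in graph.get(a, {}) if e != b]
    candidates.sort(key=lambda e: len(graph[e]), reverse=True)
    best, best_dist = None, float("inf")
    for c in candidates[:beam_width]:  # keep only the top-k candidates
        dist = graph[a][c] + shortest_distance(graph, c, b, blocked={a})
        if dist < best_dist:
            best, best_dist = c, dist
    if best is None:
        return None
    evidence = [seg for d in docs for seg in co_mentions(d, a, best)]
    return {"via": best, "distance": best_dist, "evidence": evidence}

# Toy corpus (made up for illustration): alpha and beta are never co-mentioned.
docs = [
    "Alpha binds Gamma. Gamma is well studied.",
    "Gamma regulates Beta in most tissues.",
    "Alpha was compared with Delta.",
]
result = relate(docs, "alpha", "beta", ["alpha", "beta", "gamma", "delta"])
print(result["via"], result["evidence"])
```

In this toy corpus the two target entities never appear together, so the sketch takes the multi-hop branch and returns the intermediate entity along with the text segments that co-mention it with the first entity, mirroring the final step of the claim. Swapping in a different centrality measure, an informative A* heuristic, or a wider co-mention window changes only the corresponding helper.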