US 12,406,136 B2
Method and system for determining relationships between linguistic entities
Bharathwaj Raghunathan, Mississauga (CA); Vishal Jain, Toronto (CA); Tyler Wagner, Boston, MA (US); Tyler Feener, Toronto (CA); Shimeng Chen, Toronto (CA); Eric Brine, Toronto (CA); Lorenzo Kogler Anele, Toronto (CA); Danylo Oliynyk, Toronto (CA); and Rashik Shahjahan, Toronto (CA)
Assigned to nference, Inc., Cambridge, MA (US)
Filed by nference, Inc., Cambridge, MA (US)
Filed on Nov. 4, 2022, as Appl. No. 18/052,697.
Claims priority of provisional application 63/276,342, filed on Nov. 5, 2021.
Prior Publication US 2023/0143418 A1, May 11, 2023
Int. Cl. G06F 40/205 (2020.01); G06F 40/289 (2020.01); G06F 40/30 (2020.01)

CPC G06F 40/205 (2020.01) [G06F 40/289 (2020.01); G06F 40/30 (2020.01)]

49 Claims

1. A method comprising:

receiving, by a computing device, a first entity and a second entity;

accessing a corpus;

preprocessing, by the computing device, the corpus by:

grouping the corpus into a plurality of chunks at a head node;

distributing the plurality of chunks to a plurality of worker nodes configured as parallel processing units within a distributed cluster, wherein each worker node is executed on a separate physical or virtual machine and operates asynchronously;

retrieving a second plurality of sentences from one of the plurality of chunks of the corpus by one of the plurality of worker nodes;

extracting then sending a plurality of extracted entities and extracted relational phrases to the head node;

mapping the extracted relational phrases in a pretrained vector space using a pretrained model to generate a plurality of extracted relational phrase embeddings;

clustering the extracted relational phrases in the pretrained vector space to generate clustering information for the plurality of extracted relational phrase embeddings; and

storing a mapping of the plurality of extracted entities, the plurality of extracted relational phrase embeddings, and the clustering information for the plurality of extracted relational phrase embeddings;

retrieving, by the computing device, a first plurality of sentences containing the first entity and the second entity from the corpus;

identifying, by the computing device, a plurality of relational phrases by extracting a relational phrase from each of the first plurality of sentences; and

identifying, by the computing device, one or more relationships between the first entity and the second entity.
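
The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and everything in it is an assumption made for illustration: the toy `CORPUS` and `ENTITIES`, the hand-written `TOY_VECTORS` table standing in for a pretrained model, the use of local threads in place of worker nodes on separate machines, the rule that a relational phrase is the text between two known entities, and the greedy cosine-threshold clustering. The claim does not specify how relationships are derived from the relational phrases; the sketch simply counts cluster labels.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

# Hypothetical toy inputs; a real system would use a large corpus and an entity recognizer.
CORPUS = [
    "aspirin inhibits cox1 in platelets",
    "aspirin blocks cox1 irreversibly",
    "ibuprofen inhibits cox2 reversibly",
    "aspirin is prescribed with ibuprofen rarely",
]
ENTITIES = {"aspirin", "ibuprofen", "cox1", "cox2"}
# Hand-written vectors standing in for a pretrained embedding model (assumption).
TOY_VECTORS = {"inhibits": (1.0, 0.0), "blocks": (0.9, 0.1),
               "is prescribed with": (0.0, 1.0)}

def chunk(sentences, size):
    """Head-node step: group the corpus into fixed-size chunks."""
    return [sentences[i:i + size] for i in range(0, len(sentences), size)]

def extract(sentences):
    """Worker step: return (entity, relational phrase, entity) triples.
    The phrase is the text between two known entities (an assumption)."""
    triples = []
    for sentence in sentences:
        tokens = sentence.split()
        positions = [i for i, tok in enumerate(tokens) if tok in ENTITIES]
        for i, j in combinations(positions, 2):
            phrase = " ".join(tokens[i + 1:j])
            if phrase:
                triples.append((tokens[i], phrase, tokens[j]))
    return triples

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return dot / norm if norm else 0.0

def cluster(embeddings, threshold=0.9):
    """Greedy clustering: join the first cluster whose seed vector is similar enough."""
    seeds, labels = [], {}
    for phrase, vec in embeddings.items():
        for label, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels[phrase] = label
                break
        else:  # no existing cluster matched, so start a new one
            labels[phrase] = len(seeds)
            seeds.append(vec)
    return labels

def preprocess(corpus, chunk_size=2, workers=2):
    """Chunk, distribute, extract, embed, cluster, and store the mapping."""
    chunks = chunk(corpus, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:  # threads stand in for worker nodes
        triples = [t for result in pool.map(extract, chunks) for t in result]
    embeddings = {p: TOY_VECTORS.get(p, (0.0, 0.0)) for _, p, _ in triples}
    entities = {e for a, _, b in triples for e in (a, b)}
    return {"entities": entities, "embeddings": embeddings,
            "clusters": cluster(embeddings)}

def relationships(corpus, store, first, second):
    """Query step: retrieve sentences with both entities, extract one phrase
    from each, and count the stored cluster labels of those phrases."""
    sentences = [s for s in corpus if {first, second} <= set(s.split())]
    phrases = [p for a, p, b in extract(sentences) if {a, b} == {first, second}]
    return Counter(store["clusters"][p] for p in phrases)

store = preprocess(CORPUS)
print(relationships(CORPUS, store, "aspirin", "cox1"))  # Counter({0: 2})
```

In this toy run, "inhibits" and "blocks" fall into the same cluster, so both sentences linking aspirin and cox1 count toward a single relationship, while "is prescribed with" forms a separate cluster.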