US 12,406,136 B2
Method and system for determining relationships between linguistic entities
Bharathwaj Raghunathan, Mississauga (CA); Vishal Jain, Toronto (CA); Tyler Wagner, Boston, MA (US); Tyler Feener, Toronto (CA); Shimeng Chen, Toronto (CA); Eric Brine, Toronto (CA); Lorenzo Kogler Anele, Toronto (CA); Danylo Oliynyk, Toronto (CA); and Rashik Shahjahan, Toronto (CA)
Assigned to nference, Inc., Cambridge, MA (US)
Filed by nference, Inc., Cambridge, MA (US)
Filed on Nov. 4, 2022, as Appl. No. 18/052,697.
Claims priority of provisional application 63/276,342, filed on Nov. 5, 2021.
Prior Publication US 2023/0143418 A1, May 11, 2023
Int. Cl. G06F 40/205 (2020.01); G06F 40/289 (2020.01); G06F 40/30 (2020.01)
CPC G06F 40/205 (2020.01) [G06F 40/289 (2020.01); G06F 40/30 (2020.01)] 49 Claims
 
1. A method comprising:
receiving, by a computing device, a first entity and a second entity;
accessing a corpus;
preprocessing, by the computing device, the corpus by:
grouping the corpus into a plurality of chunks at a head node;
distributing the plurality of chunks to a plurality of worker nodes configured as parallel processing units within a distributed cluster, wherein each worker node is executed on a separate physical or virtual machine and operates asynchronously;
retrieving a second plurality of sentences from one of the plurality of chunks of the corpus by one of the plurality of worker nodes;
extracting then sending a plurality of extracted entities and extracted relational phrases to the head node;
mapping the extracted relational phrases in a pretrained vector space using a pretrained model to generate a plurality of extracted relational phrase embeddings;
clustering the extracted relational phrases in the pretrained vector space to generate clustering information for the plurality of extracted relational phrase embeddings; and
storing a mapping of the plurality of extracted entities, the plurality of extracted relational phrase embeddings, and the clustering information for the plurality of extracted relational phrase embeddings;
retrieving, by the computing device, a first plurality of sentences containing the first entity and the second entity from the corpus;
identifying, by the computing device, a plurality of relational phrases by extracting a relational phrase from each of the first plurality of sentences; and
identifying, by the computing device, one or more relationships between the first entity and the second entity.
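
For illustration only, the steps recited in claim 1 can be sketched as a small single-machine Python program. This is a minimal sketch under stated assumptions, not the patented implementation: the entity vocabulary ENTITIES, the sample CORPUS, and every function name are hypothetical; a thread pool stands in for the distributed cluster of worker nodes; a bag-of-words vector stands in for the pretrained model; and a greedy cosine-threshold grouping stands in for whatever clustering the system uses.

```python
import math
import re
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical entity vocabulary and corpus, used only for this sketch.
ENTITIES = ["aspirin", "headache", "ibuprofen", "fever"]
CORPUS = ("Aspirin relieves headache. Ibuprofen reduces fever. "
          "Aspirin often relieves headache. Aspirin does not cause fever.")


def split_sentences(text):
    """Split text into sentences at ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract(sentence):
    """Return (entity, phrase, entity) triples: the words between adjacent entities."""
    low = sentence.lower()
    found = sorted((m.start(), m.end(), e) for e in ENTITIES
                   for m in re.finditer(r"\b%s\b" % re.escape(e), low))
    triples = []
    for (_, end_a, a), (start_b, _, b) in zip(found, found[1:]):
        phrase = low[end_a:start_b].strip(" ,.;")
        if phrase:
            triples.append((a, phrase, b))
    return triples


def embed(phrase):
    """Toy embedding (assumption): word counts instead of a pretrained model."""
    return Counter(phrase.split())


def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u)
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


def cluster(phrases, threshold=0.5):
    """Greedy clustering (assumption): join the first cluster whose seed is similar."""
    seeds, labels = [], {}
    for phrase in phrases:
        vec = embed(phrase)
        for idx, seed in enumerate(seeds):
            if cosine(vec, seed) >= threshold:
                labels[phrase] = idx
                break
        else:
            seeds.append(vec)
            labels[phrase] = len(seeds) - 1
    return labels


def worker(chunk):
    """Worker-node step: extract triples from every sentence in one chunk."""
    return [t for sentence in chunk for t in extract(sentence)]


def preprocess(corpus, chunk_size=2, workers=2):
    """Head-node step: chunk the corpus, fan out to workers, cluster, store the mapping."""
    sentences = split_sentences(corpus)
    chunks = [sentences[i:i + chunk_size]
              for i in range(0, len(sentences), chunk_size)]
    # Threads stand in for separate physical or virtual machines (assumption).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        triples = [t for result in pool.map(worker, chunks) for t in result]
    labels = cluster(sorted({phrase for _, phrase, _ in triples}))
    return {"triples": triples, "labels": labels}


def relationships(corpus, index, first, second):
    """Group the relational phrases linking two entities by stored cluster label."""
    groups = {}
    for sentence in split_sentences(corpus):
        low = sentence.lower()
        if first in low and second in low:
            for a, phrase, b in extract(sentence):
                if {a, b} == {first, second}:
                    groups.setdefault(index["labels"][phrase], set()).add(phrase)
    return groups


if __name__ == "__main__":
    index = preprocess(CORPUS)
    print(relationships(CORPUS, index, "aspirin", "headache"))
```

In this sketch each cluster label plays the role of one candidate relationship: phrases that land in the same cluster (here "relieves" and "often relieves") are treated as expressing the same relation between the two queried entities. The query reuses the same corpus that was preprocessed, so every extracted phrase already has a stored cluster label.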