CPC G06F 16/285 (2019.01) [G06F 16/2246 (2019.01); G06F 16/288 (2019.01); G06N 20/00 (2019.01); G06F 17/11 (2013.01)] | 20 Claims |
1. A computer-implemented method for genealogical entity resolution comprising:
at an online system comprising memory and one or more processors:
obtaining, from a genealogical tree database, a first tree person from a first genealogical tree and a second tree person from a second genealogical tree, the first genealogical tree and the second genealogical tree each comprising a plurality of interconnected tree persons corresponding to individuals that are related to each other;
identifying a familial category such that the familial category of the first genealogical tree comprises at least one tree person and the familial category of the second genealogical tree comprises a plurality of tree persons;
extracting a first feature for the at least one tree person from the first genealogical tree and a corresponding first feature for each of the plurality of tree persons from the second genealogical tree;
generating a plurality of similarity scores, each similarity score based on a similarity of the first feature for the at least one tree person from the first genealogical tree to the corresponding first feature for respective tree persons of the plurality of tree persons from the second genealogical tree; and
identifying a representative pairing of tree persons within the familial category of the first and second genealogical trees, the representative pairing having a maximum similarity score of the plurality of similarity scores.
|
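The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The names `TreePerson`, `name_similarity`, and `representative_pairing` are hypothetical. The sketch assumes the extracted feature is a name string and the similarity measure is a generic string ratio from the standard library; the claim specifies neither.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from itertools import product
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TreePerson:
    """Hypothetical record for one tree person (not defined in the claim)."""
    person_id: str
    name: str       # the "first feature" in this sketch (assumed to be a name)
    category: str   # familial category, e.g. "child", "spouse" (assumed labels)


def name_similarity(a: str, b: str) -> float:
    """Assumed similarity measure: case-insensitive string ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def representative_pairing(
    tree_a: List[TreePerson],
    tree_b: List[TreePerson],
    category: str,
) -> Optional[Tuple[float, TreePerson, TreePerson]]:
    """Return (score, person_a, person_b) for the highest-scoring pair
    within one familial category, or None if either tree has no member
    in that category."""
    # Restrict both trees to the chosen familial category.
    group_a = [p for p in tree_a if p.category == category]
    group_b = [p for p in tree_b if p.category == category]
    if not group_a or not group_b:
        return None
    # One similarity score per cross-tree pair of tree persons.
    scored = [
        (name_similarity(a.name, b.name), a, b)
        for a, b in product(group_a, group_b)
    ]
    # The representative pairing is the pair with the maximum score.
    return max(scored, key=lambda item: item[0])


# Usage with made-up data: one child in the first tree, two in the second.
first_tree = [TreePerson("a1", "Mary Smith", "child")]
second_tree = [
    TreePerson("b1", "John Smith", "child"),
    TreePerson("b2", "Mary Smyth", "child"),
    TreePerson("b3", "Mary Smith", "spouse"),  # different category, ignored
]
best = representative_pairing(first_tree, second_tree, "child")
```

In this made-up example the pairing selected for the "child" category is `a1` with `b2`. The exact match `b3` is never scored because it belongs to a different familial category.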