US 11,960,548 B2
System and method for genealogical entity resolution
Tyler Folkman, Lehi, UT (US); Rey Furner, Lehi, UT (US); and Drew Pearson, Lehi, UT (US)
Assigned to Ancestry.com Operations Inc., Lehi, UT (US)
Appl. No. 17/261,458
Filed by Ancestry.com Operations Inc., Lehi, UT (US)
PCT Filed Jul. 22, 2019, PCT No. PCT/US2019/042807 § 371(c)(1), (2) Date Jan. 19, 2021, PCT Pub. No. WO2020/018991, PCT Pub. Date Jan. 23, 2020.
Claims priority of provisional application 62/701,322, filed on Jul. 20, 2018.
Prior Publication US 2021/0319003 A1, Oct. 14, 2021
Int. Cl. G06F 16/00 (2019.01); G06F 16/215 (2019.01); G06F 16/22 (2019.01); G06F 16/25 (2019.01); G06F 16/28 (2019.01); G06F 16/906 (2019.01); G06F 18/22 (2023.01); G06N 3/045 (2023.01); G06N 20/20 (2019.01)

CPC G06F 16/906 (2019.01) [G06F 16/215 (2019.01); G06F 16/2246 (2019.01); G06F 16/258 (2019.01); G06F 16/287 (2019.01); G06F 18/22 (2023.01); G06N 3/045 (2023.01); G06N 20/20 (2019.01)]

20 Claims

1. A computer-implemented method comprising:

extracting a first set of features from a first set of tree data and a second set of features from a second set of tree data, wherein the first set of tree data corresponds to a first tree person from a first genealogical tree and the second set of tree data corresponds to a second tree person from a second genealogical tree, and wherein each of the first genealogical tree and the second genealogical tree comprises a plurality of interconnected nodes representing relationships between tree persons;

generating, utilizing a feature comparator to compare the first set of features and the second set of features:

an individual-level similarity vector comprising paired individual-level features from paired tree persons, wherein the paired tree persons include: i) one or more of the first tree person or a first relative related to the first tree person and ii) one or more of the second tree person or a second relative related to the second tree person; and

a family-level similarity score from features across familial relationships of the paired tree persons;

generating, utilizing an individual-level machine learning model to analyze the individual-level similarity vector, an individual-level similarity score defining a similarity between the paired tree persons;

determining, utilizing a family-level machine learning model based on the individual-level similarity score and the family-level similarity score, that the first tree person and the second tree person are duplicates of a single individual; and

modifying a cluster database based on determining that the first tree person and the second tree person are duplicates of the single individual.
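
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (`TreePerson`, `extract_features`, `individual_model`, `family_model`, `ClusterDatabase`, `resolve`) is hypothetical, the two "models" are logistic functions with hand-picked weights standing in for trained machine learning models, and the features (name and birth year) are assumptions chosen only for illustration. For brevity the individual-level vector pairs only the two tree persons themselves, while relatives contribute through the family-level score.

```python
import math
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class TreePerson:
    """Hypothetical tree node; `relatives` maps a role (e.g. "father") to a TreePerson."""
    person_id: str
    name: str
    birth_year: int
    relatives: dict = field(default_factory=dict)


def extract_features(person):
    # Assumed feature set; a real system would extract many more fields.
    return {"name": person.name.lower(), "birth_year": person.birth_year}


def name_similarity(a, b):
    return SequenceMatcher(None, a, b).ratio()


def individual_similarity_vector(f1, f2):
    # Feature comparator: one similarity value per paired individual-level feature.
    year_sim = 1.0 / (1.0 + abs(f1["birth_year"] - f2["birth_year"]))
    return [name_similarity(f1["name"], f2["name"]), year_sim]


def family_similarity(p1, p2):
    # Family-level score: mean name similarity over relationship roles present in both trees.
    shared = set(p1.relatives) & set(p2.relatives)
    if not shared:
        return 0.0
    sims = [name_similarity(p1.relatives[r].name.lower(), p2.relatives[r].name.lower())
            for r in shared]
    return sum(sims) / len(sims)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def individual_model(vector):
    # Stand-in for a trained individual-level model (weights are illustrative).
    weights, bias = [4.0, 2.0], -3.0
    return sigmoid(sum(w * v for w, v in zip(weights, vector)) + bias)


def family_model(individual_score, family_score):
    # Stand-in for a trained family-level model (weights are illustrative).
    return sigmoid(3.0 * individual_score + 2.0 * family_score - 2.5)


class ClusterDatabase:
    """Toy cluster store mapping each person id to a cluster id."""

    def __init__(self):
        self.cluster_of = {}

    def find(self, person_id):
        return self.cluster_of.setdefault(person_id, person_id)

    def merge(self, a, b):
        keep, drop = self.find(a), self.find(b)
        for person_id, cluster in self.cluster_of.items():
            if cluster == drop:
                self.cluster_of[person_id] = keep


def resolve(p1, p2, db, threshold=0.5):
    vector = individual_similarity_vector(extract_features(p1), extract_features(p2))
    individual_score = individual_model(vector)
    family_score = family_similarity(p1, p2)
    is_duplicate = family_model(individual_score, family_score) >= threshold
    if is_duplicate:
        db.merge(p1.person_id, p2.person_id)  # modify the cluster database
    return is_duplicate
```

In this sketch, calling `resolve` on two `TreePerson` objects from different trees returns whether they were judged duplicates and, if so, places both identifiers in the same cluster of the `ClusterDatabase`.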