| CPC G06F 16/2457 (2019.01) [G06N 3/0464 (2023.01); G06N 3/08 (2013.01)] | 18 Claims |

|
1. A computer-implemented method, comprising:
receiving a target individual genetic dataset associated with a target individual;
identifying a plurality of matched individuals who genetically match with the target individual;
identifying a plurality of potential ancestors who are potential common ancestors between the target individual and one of the matched individuals;
inputting a set of features related to the target individual to a machine learning model, wherein training of the machine learning model comprises:
generating a plurality of training samples, wherein generating one of the plurality of training samples comprises:
identifying a training target individual, the training target individual having a genetic dataset,
identifying ancestors of the training target individual from one or more family trees of the training target individual,
determining, based on existing family trees that include the training target individual, whether each of the ancestors is a direct-line ancestor of the training target individual, and
assigning a positive label to a particular ancestor responsive to the particular ancestor being a direct-line ancestor of the training target individual; and
training the machine learning model using the plurality of training samples; and
filtering the plurality of potential common ancestors using the machine learning model to identify a subset of the potential common ancestors of the target individual.
|
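The training-sample generation and filtering steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the family tree is assumed to be a simple child-to-parents mapping, the names `TrainingSample`, `direct_line_ancestors`, `generate_training_samples`, and `filter_potential_ancestors` are hypothetical, the label 0 for ancestors that are not direct-line is an assumption (the claim recites only the positive label), and the `score` callable with a 0.5 threshold merely stands in for whatever trained machine learning model is used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Set


@dataclass(frozen=True)
class TrainingSample:
    target_id: str
    ancestor_id: str
    label: int  # 1 = direct-line ancestor; 0 is an assumed negative label


def direct_line_ancestors(parents: Dict[str, List[str]], person_id: str) -> Set[str]:
    """Collect every ancestor reachable by following parent links upward."""
    seen: Set[str] = set()
    stack = list(parents.get(person_id, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def generate_training_samples(
    parents: Dict[str, List[str]],
    target_id: str,
    tree_ancestors: Iterable[str],
) -> List[TrainingSample]:
    """Label each ancestor listed in the target's trees as direct-line or not."""
    direct = direct_line_ancestors(parents, target_id)
    return [
        TrainingSample(target_id, a, 1 if a in direct else 0)
        for a in tree_ancestors
    ]


def filter_potential_ancestors(
    score: Callable[[str], float],
    candidates: Iterable[str],
    threshold: float = 0.5,  # assumed cutoff, not taken from the claim
) -> List[str]:
    """Keep candidates whose model score meets the threshold."""
    return [c for c in candidates if score(c) >= threshold]


# Hypothetical tree: "t" has parents "m" and "f"; "m" and "u" share parent "gm".
parents = {"t": ["m", "f"], "m": ["gm"], "u": ["gm"]}
samples = generate_training_samples(parents, "t", ["m", "gm", "u", "x"])

# A fixed score table stands in for the trained model's output.
scores = {"m": 0.9, "gm": 0.7, "u": 0.2}
kept = filter_potential_ancestors(lambda c: scores.get(c, 0.0), ["m", "gm", "u"])
```

In this toy tree, "u" is a sibling of "m" and therefore a collateral relative rather than a direct-line ancestor of "t", so it receives the assumed negative label and is dropped by the filter.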