CPC G16B 20/20 (2019.02) [G16B 5/20 (2019.02); G16B 10/00 (2019.02); G16B 20/40 (2019.02); G16B 40/20 (2019.02); G16B 40/30 (2019.02)] | 20 Claims |
1. A computer-implemented method for improving a machine learning model used for trait detection, the computer-implemented method comprising:
accessing a target set of DNA features of a target individual;
identifying, based on the DNA features of the target individual or a family tree which includes the target individual, one or more related individuals who are related to the target individual;
accessing one or more non-DNA features associated with the related individuals;
generating a target feature vector that combines the target set of DNA features of the target individual and the one or more non-DNA features associated with the related individuals, the target feature vector including a target set of numerical values with one or more of the numerical values representing one or more of the DNA features and with one or more of the numerical values representing the one or more of the non-DNA features; and
inputting the target feature vector to a machine learning model to generate a prediction of a trait of the target individual, the prediction being a classification of the trait or a probability that the target individual has the trait, wherein training of the machine learning model comprises:
inputting training samples with training labels and training feature vectors that combine DNA features and non-DNA features of the training samples;
determining, in forward propagation, predicted results of the machine learning model; and
adjusting coefficients of the machine learning model in backpropagation by comparing the predicted results to the training labels.
|
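The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and everything specific in it is an assumption made for illustration: the DNA features are encoded as allele counts at two hypothetical markers, the only non-DNA feature is a hypothetical per-relative trait indicator, the relatives' values are averaged before being combined with the DNA features, and the model is a single-layer logistic model, for which backpropagation reduces to one gradient step per sample. The function names `build_feature_vector`, `predict_proba`, and `train` are likewise hypothetical.

```python
import math


def build_feature_vector(dna_features, relatives_non_dna):
    """Combine the target's DNA features with relatives' non-DNA features.

    Assumption: relatives' non-DNA features are averaged column by column,
    then appended to the DNA features to form one numerical vector.
    """
    count = len(relatives_non_dna)
    width = len(relatives_non_dna[0])
    averaged = [sum(r[j] for r in relatives_non_dna) / count for j in range(width)]
    return list(dna_features) + averaged


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Forward propagation: probability that the individual has the trait."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train(samples, labels, epochs=500, lr=0.5):
    """Fit the coefficients by comparing predicted results to training labels."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = predict_proba(weights, bias, x)  # forward propagation
            error = p - y  # cross-entropy gradient w.r.t. the pre-activation
            # Backward step: adjust each coefficient against its gradient.
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


# Toy training set (fabricated for illustration only). Each vector is
# [allele count at marker A, allele count at marker B,
#  fraction of relatives reporting the trait].
training_vectors = [
    [2, 1, 1.0],
    [2, 0, 0.5],
    [0, 1, 0.0],
    [0, 0, 0.0],
]
training_labels = [1, 1, 0, 0]

weights, bias = train(training_vectors, training_labels)

# Target individual: DNA features plus non-DNA features of two relatives.
target_vector = build_feature_vector([2, 1], [[1.0], [0.0]])
probability = predict_proba(weights, bias, target_vector)
classification = probability >= 0.5  # prediction expressed as a classification
```

In this sketch `probability` corresponds to the claimed "probability that the target individual has the trait" and `classification` to the claimed "classification of the trait". A multi-layer network would follow the same forward and backward pattern, with the gradient propagated through each layer in turn.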