US 11,735,290 B2
Estimation of phenotypes using DNA, pedigree, and historical data
Ahna R. Girshick, Berkeley, CA (US); Natalie Telis, Mountain View, CA (US); Julie M. Granka, San Francisco, CA (US); Asher Keith Haug Baltzell, Salt Lake City, UT (US); Shiya Song, San Mateo, CA (US); Genevieve Heather Linnea Roberts, Salt Lake City, UT (US); Shannon Ries McCurdy, Berkeley, CA (US); and Jialiang Gu, Berkeley, CA (US)
Assigned to Ancestry.com DNA, LLC, Lehi, UT (US)
Filed by Ancestry.com DNA, LLC, Lehi, UT (US)
Filed on Jan. 14, 2021, as Appl. No. 17/149,600.
Application 17/149,600 is a continuation of application No. 16/669,530, filed on Oct. 31, 2019, granted, now 10,896,742.
Claims priority of provisional application 62/753,758, filed on Oct. 31, 2018.
Prior Publication US 2021/0134391 A1, May 6, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G16B 20/20 (2019.01); G16B 40/20 (2019.01); G16B 5/20 (2019.01); G16B 20/40 (2019.01); G16B 10/00 (2019.01); G16B 40/30 (2019.01)
CPC G16B 20/20 (2019.02) [G16B 5/20 (2019.02); G16B 10/00 (2019.02); G16B 20/40 (2019.02); G16B 40/20 (2019.02); G16B 40/30 (2019.02)]
20 Claims
 
1. A computer-implemented method for improving a machine learning model used for trait detection, the computer-implemented method comprising:
accessing a target set of DNA features of a target individual;
identifying, based on the DNA features of the target individual or a family tree which includes the target individual, one or more related individuals who are related to the target individual;
accessing one or more non-DNA features associated with the related individuals;
generating a target feature vector that combines the target set of DNA features of the target individual and the one or more non-DNA features associated with the related individuals, the target feature vector including a target set of numerical values with one or more of the numerical values representing one or more of the DNA features and with one or more of the numerical values representing the one or more of the non-DNA features; and
inputting the target feature vector to a machine learning model to generate a prediction of a trait of the target individual, the prediction being a classification of the trait or a probability that the target individual has the trait, wherein training of the machine learning model comprises:
inputting training samples with training labels and training feature vectors that combine DNA features and non-DNA features of the training samples;
determining, in forward propagation, predicted results of the machine learning model; and
adjusting coefficients of the machine learning model in backpropagation by comparing the predicted results to the training labels.
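The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: it assumes a single-layer logistic model as a stand-in for the unspecified machine learning model, and all function names, feature choices (genotype dosages at two hypothetical SNPs, plus the fraction of related individuals with a matching historical record), and toy values are invented for illustration only.

```python
import math

def build_feature_vector(dna_features, relative_non_dna_features):
    """Combine the target's DNA features with non-DNA features of relatives
    into one vector of numerical values (hypothetical encoding)."""
    return ([float(v) for v in dna_features]
            + [float(v) for v in relative_non_dna_features])

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, x):
    """Forward pass: probability that the individual has the trait."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train(samples, labels, epochs=1000, lr=0.1):
    """Fit coefficients by comparing predictions to training labels."""
    n = len(samples[0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = predict(weights, bias, x)   # forward propagation
            err = p - y                     # log-loss gradient at the output
            # Backward step: for a single layer, the gradient with respect
            # to each coefficient is simply err * input.
            for i in range(n):
                weights[i] -= lr * err * x[i]
            bias -= lr * err
    return weights, bias

# Toy training samples (assumed values): [dosage SNP A, dosage SNP B] + [relative record fraction]
train_x = [
    build_feature_vector([2, 1], [0.9]),
    build_feature_vector([2, 2], [0.8]),
    build_feature_vector([1, 2], [0.7]),
    build_feature_vector([0, 0], [0.1]),
    build_feature_vector([0, 1], [0.2]),
    build_feature_vector([1, 0], [0.0]),
]
train_y = [1, 1, 1, 0, 0, 0]  # training labels: 1 = has the trait

weights, bias = train(train_x, train_y)

# Target individual: own DNA features plus a non-DNA feature from relatives.
target_vector = build_feature_vector([2, 2], [0.9])
probability = predict(weights, bias, target_vector)
classification = probability >= 0.5
print(f"trait probability: {probability:.3f}, classified as having trait: {classification}")
```

The sketch returns both forms of prediction named in the claim: a probability and a thresholded classification. Identifying the related individuals from DNA matches or a family tree is left out; the relative-derived feature is supplied directly as an assumed input.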