US 12,217,827 B2
Detecting fetal sub-chromosomal aneuploidies
Darya I. Chudova, San Jose, CA (US); and Diana Abdueva, Orinda, CA (US)
Assigned to Verinata Health, Inc., San Diego, CA (US)
Filed by Verinata Health, Inc., San Diego, CA (US)
Filed on Apr. 25, 2019, as Appl. No. 16/395,066.
Application 16/395,066 is a continuation of application No. 14/726,183, filed on May 29, 2015, granted, now 10,318,704.
Claims priority of provisional application 62/005,877, filed on May 30, 2014.
Prior Publication US 2019/0318805 A1, Oct. 17, 2019
This patent is subject to a terminal disclaimer.
Int. Cl. C12Q 1/6886 (2018.01); C12Q 1/6858 (2018.01); C12Q 1/6869 (2018.01); G16B 20/00 (2019.01); G16B 20/10 (2019.01); G16B 20/20 (2019.01); G16B 30/00 (2019.01); G16B 30/10 (2019.01); G16B 30/20 (2019.01); G16H 50/30 (2018.01)

CPC G16B 20/10 (2019.02) [C12Q 1/6858 (2013.01); G16B 20/00 (2019.02); G16B 20/20 (2019.02); G16B 30/00 (2019.02); G16B 30/10 (2019.02); C12Q 1/6869 (2013.01)]

21 Claims

1. A method, implemented at a computer system that includes one or more processors and system memory, for evaluation of copy number of a sequence of interest in a test sample comprising nucleic acids, the method comprising:

(a) sequencing DNA in the test sample using a sequencer to obtain sequence reads;

(b) obtaining, by the computer system, the sequence reads;

(c) aligning, by the computer system, the sequence reads of the test sample to a reference genome comprising the sequence of interest, thereby providing test sequence tags, wherein the reference genome is divided into a plurality of bins, wherein the sequence of interest is in a sub-chromosomal genomic region in which a copy number variation is associated with a genetic syndrome;

(d) determining, by the computer system, coverages of the test sequence tags for bins in the sequence of interest;

(e) adjusting, by the computer system, the coverages of the test sequence tags for the bins in the sequence of interest using expected coverages for the bins in the sequence of interest obtained from a subset of a training set of unaffected training samples without using unaffected samples outside the subset, wherein training samples in the subset were obtained by:

selecting unaffected samples from the training set of unaffected training samples that are more highly correlated with each other than training samples of the training set of unaffected training samples not in the subset are correlated with each other, and wherein the correlation is based on coverages in a plurality of bins outside the sequence of interest; and

(f) making, by the computer system, a call of the copy number variation of the sequence of interest in the test sample based on the adjusted coverages from (e).
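
For illustration only, steps (d) through (f) can be sketched in Python under several assumptions that the claim does not state: coverage is taken as a tag count per bin normalized by the sample total, the correlated subset is approximated by keeping the k training samples with the highest mean pairwise Pearson correlation over bins outside the sequence of interest, the expected coverage is the subset median, and the call uses hypothetical ratio thresholds of 0.95 and 1.05. All function names, the toy counts, and the thresholds are invented for this sketch and do not represent the patented implementation.

```python
from statistics import mean, median

def normalize(counts):
    """Assumed coverage measure: per-bin tag count divided by the sample total."""
    total = sum(counts)
    return [c / total for c in counts]

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def select_correlated_subset(training, outside_bins, k):
    """Keep the k samples with the highest mean correlation to the others,
    using only bins outside the sequence of interest (a simplification)."""
    profiles = [[s[b] for b in outside_bins] for s in training]
    scores = []
    for i, p in enumerate(profiles):
        others = [pearson(p, q) for j, q in enumerate(profiles) if j != i]
        scores.append((mean(others), i))
    return sorted(i for _, i in sorted(scores, reverse=True)[:k])

def adjust_coverage(test, training, subset, roi_bins):
    """Divide test coverage by the expected coverage (subset median) per bin."""
    return [test[b] / median(training[i][b] for i in subset) for b in roi_bins]

def call_copy_number(adjusted, low=0.95, high=1.05):
    """Hypothetical thresholds on the mean adjusted coverage ratio."""
    ratio = mean(adjusted)
    if ratio < low:
        return "loss"
    if ratio > high:
        return "gain"
    return "no variation detected"

# Toy data: six bins, with bins 2 and 3 standing in for the sequence of interest.
ROI_BINS, OUTSIDE_BINS = [2, 3], [0, 1, 4, 5]
training = [normalize(s) for s in (
    [100, 120, 110, 90, 130, 80],
    [102, 118, 111, 92, 129, 81],
    [99, 121, 108, 89, 131, 79],
    [130, 80, 100, 100, 90, 120],   # poorly correlated with the other three
)]
test_sample = normalize([100, 120, 88, 72, 130, 80])  # reduced coverage in bins 2-3

subset = select_correlated_subset(training, OUTSIDE_BINS, k=3)
adjusted = adjust_coverage(test_sample, training, subset, ROI_BINS)
call = call_copy_number(adjusted)
```

In this sketch the fourth toy training sample is excluded from the subset, so the expected coverages come only from the three mutually correlated samples, mirroring the limitation in step (e) that unaffected samples outside the subset are not used.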