US 11,657,900 B2
Methods for identifying DNA copy number changes using Hidden Markov Model based estimations
Srinka Ghosh, San Francisco, CA (US)
Assigned to AFFYMETRIX, INC., Carlsbad, CA (US)
Filed by Affymetrix, Inc., Carlsbad, CA (US)
Filed on Jan. 10, 2019, as Appl. No. 16/245,165.
Application 16/245,165 is a continuation of application No. 12/143,754, filed on Jun. 20, 2008, granted, now 10,229,244.
Claims priority of provisional application 60/945,132, filed on Jun. 20, 2007.
Prior Publication US 2019/0287650 A1, Sep. 19, 2019
Int. Cl. G16B 40/30 (2019.01); G16B 40/00 (2019.01); C12Q 1/6827 (2018.01); G16B 20/20 (2019.01); G16B 20/40 (2019.01); G16B 20/10 (2019.01); G16B 20/00 (2019.01); G16B 25/00 (2019.01)
CPC G16B 40/30 (2019.02) [C12Q 1/6827 (2013.01); G16B 20/10 (2019.02); G16B 20/20 (2019.02); G16B 20/40 (2019.02); G16B 40/00 (2019.02); G16B 20/00 (2019.02); G16B 25/00 (2019.02)]
20 Claims
 
1. A computer-implemented method for estimating a copy number of each of a plurality of genomic regions in a nucleic acid sample comprising a plurality of nucleic acid molecules, each genomic region containing at least one single nucleotide polymorphism (SNP), the method comprising:
in an assay, hybridizing on a nucleic acid array comprising a plurality of perfect match probes without corresponding mismatch probes, a plurality of nucleic acid molecules with a plurality of allele-specific perfect match probes for at least one SNP;
obtaining, by a computer comprising a processor and a memory, an initial intensity measurement for each of the plurality of allele-specific perfect match probes for the at least one SNP, wherein initial intensity measurements obtained for the plurality of allele-specific perfect match probes for the at least one SNP do not include intensities for any mismatch probes;
generating a global reference from a plurality of control reference samples, wherein the plurality of control reference samples comprises a minimum number of samples estimated by testing for convergence of a distribution of probe-level quantile normalization of SNP data as a function of a number of samples;
normalizing, by the processor, the global reference and the initial intensity measurement for each of the plurality of allele-specific perfect match probes, resulting in normalized intensity measurements, wherein the normalized intensity measurements are determined for the plurality of allele-specific perfect match probes without utilizing data derived from any mismatch probes;
calculating, by the processor, an initial copy number estimate for each of the plurality of genomic regions, wherein the initial copy number estimate is based upon the normalized intensity measurements and normalized intensity measurements of the global reference;
performing, by the processor, data smoothing on the initial copy number estimates, wherein the data smoothing reduces noise within the initial copy number estimates to generate smoothed copy number estimates; and
estimating, by the processor, the copy number of each of the plurality of genomic regions using a Hidden Markov Model to assign the smoothed copy number estimates to different copy number states.
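
The computational steps recited in claim 1 (reference generation, normalization, initial estimation, smoothing, and Hidden Markov Model state assignment) can be illustrated with a minimal sketch. This is not the patented implementation. Every function name, the running-median smoother, the Gaussian emission model, the values of `sigma` and `stay`, the three-state set (copy numbers 1, 2, 3), and the use of Viterbi decoding are assumptions chosen for illustration. The claim's convergence test for the minimum number of control samples is not covered.

```python
import math
from statistics import median

def quantile_normalize(values, target_sorted):
    """Replace each value with the target value of the same rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = target_sorted[rank]
    return out

def global_reference(controls):
    """Per-probe median over quantile-normalized control samples.

    Assumption: the target distribution is the mean of the sorted
    control intensities at each rank.
    """
    n = len(controls[0])
    sorted_controls = [sorted(c) for c in controls]
    target = [sum(s[r] for s in sorted_controls) / len(controls) for r in range(n)]
    normed = [quantile_normalize(c, target) for c in controls]
    return [median(col) for col in zip(*normed)], target

def log2_ratios(sample, reference):
    """Initial estimate: log2 of sample intensity over reference intensity."""
    return [math.log2(s / r) for s, r in zip(sample, reference)]

def running_median(xs, half_window=1):
    """Simple smoother (an assumed choice; the claim names no specific filter)."""
    return [median(xs[max(0, i - half_window): i + half_window + 1])
            for i in range(len(xs))]

def viterbi(obs, means, sigma=0.25, stay=0.9):
    """Most likely state path for Gaussian emissions; needs at least 2 states."""
    n_states = len(means)
    switch = (1.0 - stay) / (n_states - 1)

    def log_emit(x, m):
        # Normalizing constant dropped: it is identical for every state.
        return -0.5 * ((x - m) / sigma) ** 2

    def log_trans(a, b):
        return math.log(stay if a == b else switch)

    score = [log_emit(obs[0], m) - math.log(n_states) for m in means]
    back = []
    for x in obs[1:]:
        prev, ptr, score = score, [], []
        for b in range(n_states):
            best = max(range(n_states), key=lambda a: prev[a] + log_trans(a, b))
            ptr.append(best)
            score.append(prev[best] + log_trans(best, b) + log_emit(x, means[b]))
        back.append(ptr)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

def estimate_copy_number(sample, controls, state_cn=(1, 2, 3)):
    """Chain the steps: reference, normalization, log2 ratio, smoothing, HMM."""
    reference, target = global_reference(controls)
    normalized = quantile_normalize(sample, target)
    smoothed = running_median(log2_ratios(normalized, reference))
    # Assumed state means: log2(copy number / 2), i.e. 0 for the diploid state.
    means = [math.log2(cn / 2) for cn in state_cn]
    return [state_cn[s] for s in viterbi(smoothed, means)]
```

In this sketch, `estimate_copy_number(sample, controls)` takes one list of perfect match probe intensities for the test sample and a list of equally long intensity lists for the control samples, and returns one copy number state per probe. The `stay` probability penalizes state changes between neighboring probes, so isolated noisy values tend not to produce spurious copy number calls.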