US 12,131,803 B2
Methods and systems for detection of somatic structural variants
Giulio Genovese, Cambridge, MA (US); Po-Ru Loh, Cambridge, MA (US); and Steven McCarroll, Cambridge, MA (US)
Assigned to The Broad Institute, Inc., Cambridge, MA (US); and President and Fellows of Harvard College, Cambridge, MA (US)
Appl. No. 16/757,329
Filed by THE BROAD INSTITUTE, INC., Cambridge, MA (US); and PRESIDENT AND FELLOWS OF HARVARD COLLEGE, Cambridge, MA (US)
PCT Filed Oct. 17, 2018, PCT No. PCT/US2018/056342
§ 371(c)(1), (2) Date Apr. 17, 2020,
PCT Pub. No. WO2019/079493, PCT Pub. Date Apr. 25, 2019.
Claims priority of provisional application 62/573,642, filed on Oct. 17, 2017.
Prior Publication US 2020/0303036 A1, Sep. 24, 2020
Int. Cl. G16B 20/10 (2019.01); C12Q 1/6883 (2018.01); C12Q 1/6886 (2018.01); G16B 5/00 (2019.01); G16B 20/20 (2019.01); G16H 10/40 (2018.01)
CPC G16B 20/10 (2019.02) [G16B 5/00 (2019.02); G16B 20/20 (2019.02); G16H 10/40 (2018.01); C12Q 1/6883 (2013.01); C12Q 1/6886 (2013.01)]
31 Claims
 
1. A computer-implemented method to detect somatic structural variants (SV), comprising:
determining, using one or more computing devices, total and relative allelic intensities for one or more samples, wherein determining the total and relative allelic intensities comprises converting genotype intensity data into log2 R ratio (LRR) and B allele frequency (BAF) values;
masking, using the one or more computing devices, constitutional segmental duplications in each sample of the one or more samples, wherein masking the constitutional segmental duplications comprises modeling, using the one or more computing devices, observed phased BAF deviations (pBAF) and wherein modeling the observed pBAFs is performed by modeling across individual chromosomes using a 25-state hidden Markov model (HMM) with states corresponding to pBAF values;
selecting regions to mask, which comprises computing a Viterbi path through the 25-state HMM and examining contiguous regions of nonzero states;
identifying, using the one or more computing devices, a putative set of somatic SV events for each sample in the one or more samples, wherein identifying the putative set of somatic SV events comprises use of a 3-state HMM and wherein the 3-state HMM is parameterized by a single parameter representing mean absolute BAF deviation (|ΔBAF|) within a given somatic SV event; and
defining, using the one or more computing devices, one or more somatic SV events for each sample of the one or more samples, based at least in part on application of a likelihood ratio test to the putative set of somatic SV events.
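
By way of non-limiting illustration only, the masking step recited above (computing a Viterbi path through a multi-state HMM over pBAF values and examining contiguous regions of nonzero states) may be sketched as follows. The sketch is not taken from the specification. The evenly spaced 25-value state grid, the Gaussian emission model, the single stay/switch transition probability, and the names `viterbi`, `nonzero_runs`, `sigma`, and `stay_prob` are all assumptions introduced for the example.

```python
import math


def viterbi(obs, states, sigma, stay_prob):
    """Return the most likely state-index path for a sequence of pBAF values.

    Assumptions (hypothetical, for illustration): Gaussian emissions centred
    on each state's pBAF value, a uniform start distribution, and one
    stay probability shared by all states.
    """
    n_states = len(states)
    log_stay = math.log(stay_prob)
    log_switch = math.log((1.0 - stay_prob) / (n_states - 1))

    def log_emit(x, mu):
        # Gaussian log-density, dropping the constant shared by all states.
        return -0.5 * ((x - mu) / sigma) ** 2

    def log_trans(i, j):
        return log_stay if i == j else log_switch

    score = [log_emit(obs[0], mu) for mu in states]
    backpointers = []
    for x in obs[1:]:
        new_score, ptr = [], []
        for j, mu in enumerate(states):
            # Best predecessor state for landing in state j at this marker.
            best_i = max(range(n_states), key=lambda i: score[i] + log_trans(i, j))
            new_score.append(score[best_i] + log_trans(best_i, j) + log_emit(x, mu))
            ptr.append(best_i)
        score = new_score
        backpointers.append(ptr)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for ptr in reversed(backpointers):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def nonzero_runs(path, zero_index):
    """Return (start, end) pairs, end exclusive, of contiguous nonzero-state runs."""
    runs, start = [], None
    for i, s in enumerate(path):
        if s != zero_index and start is None:
            start = i
        elif s == zero_index and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(path)))
    return runs


# Hypothetical 25-state grid: pBAF values from -0.24 to 0.24, zero at index 12.
states = [k * 0.02 for k in range(-12, 13)]
# Synthetic, noise-free pBAF track with one shifted segment (illustrative data only).
obs = [0.0] * 20 + [0.10] * 20 + [0.0] * 20
path = viterbi(obs, states, sigma=0.05, stay_prob=0.99)
print(nonzero_runs(path, zero_index=12))
```

In this sketch each returned run is a candidate region to mask. Any further criteria for deciding which runs are constitutional duplications are outside the example.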