CPC G16B 20/10 (2019.02) [G16B 5/00 (2019.02); G16B 20/20 (2019.02); G16H 10/40 (2018.01); C12Q 1/6883 (2013.01); C12Q 1/6886 (2013.01)] | 31 Claims |
1. A computer-implemented method to detect somatic structural variants (SV), comprising:
determining, using one or more computing devices, total and relative allelic intensities for one or more samples, wherein determining the total and relative allelic intensities comprises converting genotype intensity data into log2 R ratio (LRR) and B allele frequency (BAF) values;
masking, using the one or more computing devices, constitutional segmental duplications in each sample of the one or more samples, wherein masking the constitutional segmental duplications comprises modeling, using the one or more computing devices, observed phased BAF deviations (pBAF) and wherein modeling the observed pBAFs is performed by modeling across individual chromosomes using a 25-state hidden Markov model (HMM) with states corresponding to pBAF values;
selecting regions to mask, which comprises computing a Viterbi path through the 25-state HMM and examining contiguous regions of nonzero states;
identifying, using the one or more computing devices, a putative set of somatic SV events for each sample in the one or more samples, wherein identifying the putative set of somatic SV events comprises use of a 3-state HMM and wherein the 3-state HMM is parameterized by a single parameter representing mean absolute BAF deviation (|ΔBAF|) within a given somatic SV event; and
defining, using the one or more computing devices, one or more somatic SV events for each sample of the one or more samples, based at least in part on application of a likelihood ratio test to the putative set of somatic SV events.
|
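As an illustration of the HMM steps recited in claim 1, the following is a minimal sketch of computing a Viterbi path through a 3-state HMM whose state means (-θ, 0, +θ) are set by a single parameter θ standing in for |ΔBAF|, and then extracting contiguous regions of nonzero states. It is not the claimed implementation. The Gaussian emission model, the values of `sigma` and `p_switch`, the uniform start probabilities, and the function names `viterbi` and `nonzero_runs` are all assumptions made for illustration. The same region extraction would apply to a path from the 25-state model.

```python
import math


def log_gauss(x, mean, sigma):
    """Log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mean) ** 2 / (2 * sigma ** 2)


def viterbi(pbaf, theta, sigma=0.05, p_switch=0.01):
    """Most likely state path for a 3-state HMM with means (-theta, 0, +theta).

    sigma and p_switch are illustrative assumptions, not values from the claim.
    Returns state indices: 0 -> -theta, 1 -> zero deviation, 2 -> +theta.
    """
    means = (-theta, 0.0, theta)
    n_states = len(means)
    log_stay = math.log(1.0 - p_switch)
    log_move = math.log(p_switch / (n_states - 1))
    # Uniform start probabilities (assumption).
    score = [math.log(1.0 / n_states) + log_gauss(pbaf[0], m, sigma) for m in means]
    back = []
    for x in pbaf[1:]:
        new_score, pointers = [], []
        for j, m in enumerate(means):
            # Best predecessor of state j at this position.
            cands = [score[i] + (log_stay if i == j else log_move) for i in range(n_states)]
            best = max(range(n_states), key=cands.__getitem__)
            pointers.append(best)
            new_score.append(cands[best] + log_gauss(x, m, sigma))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=score.__getitem__)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def nonzero_runs(path, zero_state=1):
    """Half-open (start, end) index ranges of contiguous nonzero states."""
    runs, start = [], None
    for i, s in enumerate(path):
        if s != zero_state and start is None:
            start = i
        elif s == zero_state and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(path)))
    return runs


# Synthetic phased BAF deviations: 20 balanced sites, 20 shifted sites, 20 balanced sites.
pbaf = [0.0] * 20 + [0.1] * 20 + [0.0] * 20
path = viterbi(pbaf, theta=0.1)
print(nonzero_runs(path))  # [(20, 40)]
```

A run is defined here as any stretch of nonzero states regardless of sign, so a phase switch inside an event (state 0 to state 2) does not split the region.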