US 12,217,830 B2
Estimating tumor purity from single samples
Nicholas Phillips, Menlo Park, CA (US); and Jason Harris, Menlo Park, CA (US)
Assigned to Personalis, Inc., Fremont, CA (US)
Filed by Personalis, Inc., Menlo Park, CA (US)
Filed on May 3, 2022, as Appl. No. 17/735,904.
Application 17/735,904 is a continuation of application No. PCT/US2020/058951, filed on Nov. 4, 2020.
Claims priority of provisional application 62/931,096, filed on Nov. 5, 2019.
Prior Publication US 2022/0259678 A1, Aug. 18, 2022
Int. Cl. G16B 40/20 (2019.01); C12Q 1/6886 (2018.01); G16B 20/20 (2019.01); C12Q 1/686 (2018.01); C12Q 1/6874 (2018.01)
CPC G16B 40/20 (2019.02) [C12Q 1/6886 (2013.01); G16B 20/20 (2019.02); C12Q 1/686 (2013.01); C12Q 1/6874 (2013.01); C12Q 2600/156 (2013.01)]
14 Claims
 
1. A method of determining tumor purity of a biological sample of a subject for informing a cancer feature and evaluating a treatment efficacy for the subject, the method comprising:
obtaining nucleic acid sequence data from one or more sequencers that represent a plurality of nucleic acid molecules of the biological sample of the subject;
aligning the nucleic acid sequence data to a reference genome;
identifying, based on the aligned nucleic acid sequence data, a set of genomic regions, wherein each genomic region of the set of genomic regions includes one or more nucleotide-sequence variants relative to a corresponding genomic region of the reference genome;
determining a B-allele frequency for each genomic region of the set of genomic regions;
determining, based on the B-allele frequencies of the set of genomic regions, a B-allele frequency distribution for the biological sample;
processing the B-allele frequency distribution using a trained machine-learning model to estimate a probability of a true tumor purity as a function of a predicted tumor purity in the biological sample, wherein the trained machine-learning model is trained on a training dataset generated from nucleic acid sequence data derived from one or more tumor cells diluted into normal cells; and
generating a report to inform the cancer feature and evaluate the treatment efficacy for the subject based on the estimated probability of a true tumor purity as a function of a predicted tumor purity in the biological sample.
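
The two B-allele frequency steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not state how the B-allele frequency of a region is computed or how the distribution is represented. The sketch assumes a common convention: the fraction of reads supporting the non-reference allele, summarized as a normalized histogram over [0, 1]. The function names, the bin count, and the read counts are hypothetical and do not come from the patent. The trained machine-learning model and the dilution-based training dataset are not modeled here.

```python
from typing import List, Sequence, Tuple


def b_allele_frequency(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a variant site that support the non-reference (B) allele.

    Assumption: this read-count ratio stands in for the per-region
    B-allele frequency; the claim does not define the calculation.
    """
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return alt_reads / total


def baf_distribution(bafs: Sequence[float], n_bins: int = 10) -> List[float]:
    """Normalized histogram of BAF values over [0, 1]; the bins sum to 1.

    Assumption: a fixed-width histogram is one possible representation
    of the sample-level B-allele frequency distribution.
    """
    if not bafs:
        raise ValueError("at least one BAF value is required")
    counts = [0] * n_bins
    for baf in bafs:
        # A BAF of exactly 1.0 is clamped into the last bin.
        index = min(int(baf * n_bins), n_bins - 1)
        counts[index] += 1
    return [count / len(bafs) for count in counts]


# Hypothetical (ref_reads, alt_reads) counts for four variant-containing regions.
site_counts: List[Tuple[int, int]] = [(50, 50), (70, 30), (20, 80), (45, 55)]
bafs = [b_allele_frequency(ref, alt) for ref, alt in site_counts]
distribution = baf_distribution(bafs, n_bins=5)
```

In a pipeline of the kind the claim describes, a vector such as `distribution` would be the input to the trained model. A fixed bin count keeps that input the same length regardless of how many variant regions a sample contains.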