CPC G16B 30/00 (2019.02) [C12Q 1/6886 (2013.01); G06N 20/00 (2019.01); G16B 20/10 (2019.02); G16H 10/40 (2018.01); G16H 10/60 (2018.01); G16H 50/20 (2018.01); G16H 50/50 (2018.01); G16H 50/70 (2018.01); C12Q 2600/112 (2013.01)] | 15 Claims |
1. A method of determining a cancer class of a subject, comprising:
extracting a plurality of cell-free DNA molecules in a biological sample acquired from a subject;
removing, from the plurality of cell-free DNA molecules, cell-free DNA molecules longer than a first threshold length to obtain a pool of size-selected cell-free DNA molecules, wherein the first threshold length is less than 160 nucleotides;
sequencing the biological sample based on the pool of size-selected cell-free DNA molecules to obtain a plurality of size-selected sequence reads, wherein the plurality of size-selected sequence reads comprise at least 60,000 sequence reads;
identifying, from the plurality of size-selected sequence reads, a relative copy number at each respective genomic location in at least fifty genomic locations in the genome of the subject; and
applying the identified relative copy numbers into a machine learning model trained to determine the cancer class for the subject based on the relative copy number at each respective genomic location, wherein the machine learning model is trained with a training dataset labeled by cancer class.
|
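The data-processing steps of claim 1 can be sketched in code. The following is a minimal illustration under stated assumptions, not the patented implementation. The claim removes long cell-free DNA molecules before sequencing, whereas this sketch filters fragment-length records in silico only to show the effect of the threshold. The 150-nucleotide threshold, the uniform baseline, the log2-ratio definition of relative copy number, the nearest-centroid classifier, the labels `class_a` and `class_b`, and all function names are assumptions introduced for illustration; the claim does not specify any of them.

```python
import math
from collections import Counter

# Assumed values for illustration. The claim only requires a threshold
# below 160 nt, at least 60,000 reads, and at least fifty locations.
THRESHOLD_NT = 150
MIN_READS = 60_000
N_LOCATIONS = 50


def size_select(fragments, threshold=THRESHOLD_NT):
    """Drop fragments longer than the threshold (in-silico stand-in)."""
    return [(loc, length) for loc, length in fragments if length <= threshold]


def relative_copy_numbers(reads, baseline):
    """Log2 ratio of each location's read fraction to a baseline fraction."""
    counts = Counter(loc for loc, _ in reads)
    pseudo = 0.5  # pseudocount so empty locations do not give log2(0)
    total = len(reads) + pseudo * len(baseline)
    return [math.log2(((counts[loc] + pseudo) / total) / expected)
            for loc, expected in enumerate(baseline)]


def train_centroids(profiles, labels):
    """Average the training profiles of each class (nearest-centroid model)."""
    grouped = {}
    for profile, label in zip(profiles, labels):
        grouped.setdefault(label, []).append(profile)
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def predict(centroids, profile):
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], profile))


# Synthetic sample: (location index, fragment length) pairs.
fragments = []
for loc in range(N_LOCATIONS):
    n_short = 3000 if loc == 0 else 1500  # location 0 carries a simulated gain
    fragments += [(loc, 140)] * n_short + [(loc, 170)] * 500

reads = size_select(fragments)
if len(reads) < MIN_READS:
    raise ValueError("too few size-selected reads")

baseline = [1 / N_LOCATIONS] * N_LOCATIONS  # assumed uniform reference
profile = relative_copy_numbers(reads, baseline)

# Hypothetical labeled training profiles (training dataset labeled by class).
flat = [0.0] * N_LOCATIONS
gain = [1.0] + [0.0] * (N_LOCATIONS - 1)
centroids = train_centroids([flat, flat, gain, gain],
                            ["class_a", "class_a", "class_b", "class_b"])
cancer_class = predict(centroids, profile)
print(cancer_class)
```

In this synthetic input, location 0 holds twice as many short fragments as every other location, so its relative copy number lands near 1 on the log2 scale and the profile falls closest to the gain-like centroid. A real pipeline would derive read positions and fragment lengths from aligned sequencing data and would likely use a more capable trained model. The nearest-centroid classifier stands in only because it keeps the sketch dependency-free.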