US 12,258,634 B2
Fragmentation for measuring methylation and disease
Yuk-Ming Dennis Lo, Hong Kong (CN); Rossa Wai Kwun Chiu, Hong Kong (CN); Kwan Chee Chan, Hong Kong (CN); Peiyong Jiang, Hong Kong (CN); Qing Zhou, Hong Kong (CN); Guannan Kang, Hong Kong (CN); Rong Qiao, Hong Kong (CN); and Lu Ji, Hong Kong (CN)
Assigned to Centre for Novostics, New Territories (HK)
Filed by Centre for Novostics, New Territories (HK)
Filed on Mar. 6, 2023, as Appl. No. 18/117,992.
Application 18/117,992 is a continuation of application No. 18/106,793, filed on Feb. 7, 2023.
Claims priority of provisional application 63/400,244, filed on Aug. 23, 2022.
Claims priority of provisional application 63/328,710, filed on Apr. 7, 2022.
Claims priority of provisional application 63/307,622, filed on Feb. 7, 2022.
Prior Publication US 2023/0374601 A1, Nov. 23, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. C12Q 1/6886 (2018.01); C12Q 1/6806 (2018.01); C12Q 1/6851 (2018.01); G16B 20/00 (2019.01); G16B 20/30 (2019.01); G16B 30/10 (2019.01); G16B 40/00 (2019.01); G16B 40/20 (2019.01); G16B 50/00 (2019.01)
CPC C12Q 1/6886 (2013.01) [C12Q 1/6851 (2013.01); G16B 20/00 (2019.02); G16B 20/30 (2019.02); G16B 30/10 (2019.02); G16B 40/00 (2019.02); G16B 40/20 (2019.02); G16B 50/00 (2019.02); C12Q 1/6806 (2013.01); C12Q 2600/154 (2013.01)]
26 Claims
 
1. A method of analyzing a biological sample of a subject to determine a level of a pathology, wherein the pathology is a cancer, in the biological sample of the subject, the biological sample including cell-free DNA, the method comprising performing, by a computer system:
receiving, over a network connection or from a computer-readable medium, sequence reads obtained from an assay performed on a plurality of cell-free DNA molecules from the biological sample to obtain sequence reads, wherein the sequence reads include ending sequences corresponding to ends of the plurality of cell-free DNA molecules;
for each of the plurality of cell-free DNA molecules, determining a sequence motif for each of one or more ends of the cell-free DNA molecule, wherein an end of a cell-free DNA molecule has a first position at an outermost position, a second position that is next to the first position, and a third position that is next to the second position, wherein the plurality of cell-free DNA molecules includes at least 10,000 cell-free DNA molecules;
determining a first set of amounts of a first set of end sequence motifs of the plurality of cell-free DNA molecules, wherein:
each of the first set of end sequence motifs has C at the first position and G at the second position, or
each of the first set of end sequence motifs has C at the second position and G at the third position;
generating a feature vector including the first set of amounts, the feature vector generated using end sequence motifs only selected from a group consisting of (1) end sequence motifs having C at the first position and G at the second position and (2) end sequence motifs having C at the second position and G at the third position;
inputting the feature vector into a machine learning model, wherein the machine learning model is trained using cell-free DNA molecules in training samples having known classifications;
determining, using the machine learning model and the feature vector, a probability for the level of the pathology;
determining a classification of the level of the cancer for the subject based on a comparison of the probability to a cutoff value, wherein the classification is that the subject has the cancer; and
administering a treatment to the subject, wherein the treatment includes radiation therapy, immunotherapy, chemotherapy, hormone therapy, stem cell transplant, or surgery to treat the cancer.
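
The computational steps of claim 1 (end motif determination, motif amounts, feature vector, probability, and cutoff comparison) can be illustrated with a minimal Python sketch. It is not the patented implementation. Several things in it are assumptions made only for illustration: each read is taken to begin at a fragment end, so its first base is the outermost position; motifs are 3-mers; "amounts" are expressed as frequencies among all usable ends; and a logistic function with caller-supplied weights stands in for the trained machine learning model. All function and variable names are hypothetical. The claim requires at least 10,000 cell-free DNA molecules, whereas the toy input below has only four.

```python
import math
from collections import Counter

BASES = "ACGT"
# Hypothetical feature order: four CGN motifs, then four NCG motifs.
CGN_MOTIFS = ["CG" + b for b in BASES]  # C at position 1, G at position 2
NCG_MOTIFS = [b + "CG" for b in BASES]  # C at position 2, G at position 3
FEATURE_MOTIFS = CGN_MOTIFS + NCG_MOTIFS


def end_motif(read, k=3):
    """Return the k outermost bases of a read, or None if unusable.

    Assumption: the read starts at the fragment end, so read[0] is the
    first (outermost) position of the end.
    """
    motif = read[:k].upper()
    if len(motif) < k or any(base not in BASES for base in motif):
        return None
    return motif


def motif_feature_vector(reads):
    """Frequencies of the CGN and NCG end motifs among all usable ends."""
    counts = Counter()
    total = 0
    for read in reads:
        motif = end_motif(read)
        if motif is None:
            continue  # skip reads that are too short or contain ambiguous bases
        total += 1
        counts[motif] += 1
    if total == 0:
        return [0.0] * len(FEATURE_MOTIFS)
    return [counts[m] / total for m in FEATURE_MOTIFS]


def predict_probability(features, weights, bias):
    """Stand-in for a trained model: sigmoid of a weighted sum of features."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))


def classify(probability, cutoff):
    """Return True when the probability meets or exceeds the cutoff."""
    return probability >= cutoff


# Toy usage with made-up reads and made-up (untrained) weights.
toy_reads = ["CGATTT", "ACGTTT", "TTTAAA", "CGAGGG"]
toy_vector = motif_feature_vector(toy_reads)
toy_probability = predict_probability(toy_vector, [1.0] * 8, -0.5)
print(toy_vector, toy_probability, classify(toy_probability, 0.5))
```

In practice the weights and the cutoff would come from training on samples with known classifications, as the claim recites; the values above are placeholders with no diagnostic meaning.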