US 11,718,873 B2
Correcting for deamination-induced sequence errors
Marcin Sikora, Redwood City, CA (US); Andrew Kennedy, San Diego, CA (US); Ariel Jaimovich, Redwood City, CA (US); Darya Chudova, San Jose, CA (US); and Stephen Fairclough, Redwood City, CA (US)
Assigned to Guardant Health, Inc., Palo Alto, CA (US)
Filed by GUARDANT HEALTH, INC., Redwood City, CA (US)
Filed on Mar. 23, 2021, as Appl. No. 17/210,202.
Application 17/210,202 is a continuation of application No. 16/866,252, filed on May 4, 2020, granted, now 11,008,616.
Application 16/866,252 is a continuation of application No. PCT/US2018/059056, filed on Nov. 2, 2018.
Claims priority of provisional application 62/581,609, filed on Nov. 3, 2017.
Prior Publication US 2021/0395816 A1, Dec. 23, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. C12Q 1/6874 (2018.01); C12Q 1/6806 (2018.01); C12Q 1/6869 (2018.01)
CPC C12Q 1/6874 (2013.01) [C12Q 1/6806 (2013.01); C12Q 1/6869 (2013.01)]
30 Claims
 
1. A method for identifying variant nucleotides in a population of cell-free nucleic acids comprising:
(a) contacting a population of cell-free nucleic acids comprising double-stranded DNA molecules with single-stranded overhangs at one or both ends with a protein having 5′-3′ polymerase activity and a 3′-5′ exonuclease activity, wherein the protein digests 3′ overhangs and fills in 5′ overhangs with complementary nucleotides, to generate double-stranded DNA molecules with one or both ends blunt;
(b) tailing blunt ends of the DNA molecules and ligating the resulting DNA molecules to one or more adapters with a complementary tail;
(c) determining sequences of a plurality of the double-stranded DNA molecules to provide sequenced DNA molecules;
(d) for each designated position in a reference sequence,
(i) identifying a subset of sequenced DNA molecules including the designated position, and
(ii) identifying sequenced DNA molecules in the subset in which the designated position is occupied by a variant nucleotide; and
(e) calling presence of a variant nucleotide at each designated position for which the sequenced DNA molecules in step (d)(ii) support the call, except that presence of a variant nucleotide at a designated position is not called if:
(i) the variant is a C to T or G to A variation compared with the reference nucleotide; and
(ii) the variant nucleotide is classified as a deamination error based on:
(1) nucleotide context around the designated position and/or
(2) distance of the C to T variation at the designated position from the 5′-end in sequenced DNA molecules in the subset or distance of the G to A variation at the designated position from the 3′-end in sequenced DNA molecules in the subset,
wherein at least one variant nucleotide which would otherwise have been called in step (e) is not called due to conditions (i) and (ii) being determined to have been met.
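For illustration only, the logic of steps (d) and (e) can be sketched in Python. This is a minimal sketch under stated assumptions, not the patented implementation. The `Read` type, the function names, the `min_support` threshold, and the `end_window` cutoff are all hypothetical. Reads are assumed to be gap-free alignments to the forward strand of the reference. Only the distance criterion of condition (ii)(2) is modeled; the nucleotide-context criterion of (ii)(1) is omitted. Treating a variant as a deamination error only when every supporting molecule carries it near the relevant end is likewise an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class Read:
    """Hypothetical sequenced molecule: gap-free alignment to the reference."""
    start: int  # 0-based reference coordinate of the molecule's 5'-most base
    seq: str    # bases of the molecule, 5' to 3'


# Substitutions that condition (e)(i) singles out as possible deamination.
DEAMINATION_CHANGES = {("C", "T"), ("G", "A")}


def distance_from_relevant_end(read: Read, pos: int, ref_base: str) -> int:
    """Distance from the 5' end for C>T, from the 3' end for G>A."""
    offset = pos - read.start
    if ref_base == "C":
        return offset
    return len(read.seq) - 1 - offset


def call_variants(reference, reads, positions, min_support=2, end_window=3):
    """Return {position: variant base}, skipping suspected deamination errors.

    min_support and end_window are illustrative values, not from the source.
    """
    calls = {}
    for pos in positions:
        ref_base = reference[pos]
        # Step (d)(i): molecules that include the designated position.
        subset = [r for r in reads if r.start <= pos < r.start + len(r.seq)]
        # Step (d)(ii): group molecules carrying a variant base at the position.
        by_alt = {}
        for r in subset:
            base = r.seq[pos - r.start]
            if base != ref_base:
                by_alt.setdefault(base, []).append(r)
        for alt, supporting in by_alt.items():
            if len(supporting) < min_support:
                continue  # not enough support to call at all
            # Step (e): suppress the call when conditions (i) and (ii) hold.
            near_end = all(
                distance_from_relevant_end(r, pos, ref_base) < end_window
                for r in supporting
            )
            if (ref_base, alt) in DEAMINATION_CHANGES and near_end:
                continue
            calls[pos] = alt
    return calls


if __name__ == "__main__":
    reference = "ACGTACGTAC"
    reads = [
        Read(0, "ATGTAC"),    # C>T at position 1, one base from the 5' end
        Read(0, "ATGTAC"),
        Read(0, "ACGTATGT"),  # C>T at position 5, five bases from the 5' end
        Read(0, "ACGTATGT"),
    ]
    print(call_variants(reference, reads, range(len(reference))))
```

In this toy input, the C to T change at position 1 has enough support to be called but is withheld because it lies within the assumed end window in every supporting molecule, whereas the C to T change at position 5 is called. This mirrors the closing requirement that at least one variant which would otherwise have been called is not called.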