US 11,866,777 B2
Error suppression in sequenced DNA fragments using redundant reads with unique molecular indices (UMIs)
Sante Gnerre, Mountain View, CA (US); Byoungsok Jung, Atherton, CA (US); Emrah Kostem, Redwood City, CA (US); Alex Aravanis, San Mateo, CA (US); Alex So, San Diego, CA (US); Xuyu Cai, Natick, MA (US); Zhihong Zhang, San Diego, CA (US); and Frank J. Steemers, Encinitas, CA (US)
Assigned to Illumina, Inc., San Diego, CA (US)
Filed by Illumina, Inc., San Diego, CA (US)
Filed on Oct. 21, 2020, as Appl. No. 17/076,715.
Application 17/076,715 is a continuation of application No. 15/130,668, filed on Apr. 15, 2016, granted, now 10,844,428, issued on Nov. 24, 2020.
Claims priority of provisional application 62/269,485, filed on Dec. 18, 2015.
Claims priority of provisional application 62/193,469, filed on Jul. 16, 2015.
Claims priority of provisional application 62/153,699, filed on Apr. 28, 2015.
Prior Publication US 2021/0108262 A1, Apr. 15, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. C12Q 1/6869 (2018.01); G16B 30/00 (2019.01); C12N 15/10 (2006.01); C12Q 1/6806 (2018.01); G16B 30/10 (2019.01); C12Q 1/6855 (2018.01)
CPC C12Q 1/6869 (2013.01) [C12N 15/1065 (2013.01); C12Q 1/6806 (2013.01); C12Q 1/6855 (2013.01); G16B 30/00 (2019.02); G16B 30/10 (2019.02)]
14 Claims
 
1. A method for sequencing nucleic acid molecules from a sample using unique molecular indices (UMIs), wherein each unique molecular index (UMI) is an oligonucleotide sequence that can be used to identify an individual molecule of a double-stranded DNA fragment in the sample, comprising:
(a) applying adapters to both ends of double-stranded DNA fragments in the sample to obtain DNA-adapter products, wherein: each adapter comprises a double-stranded hybridized region, a single-stranded 5′ arm, a single-stranded 3′ arm, and a physical UMI on one strand or each strand of the adapter, each double-stranded DNA fragment comprises a virtual UMI, and the virtual UMI is a unique sub-sequence in a DNA fragment in the sample, and the plurality of double-stranded DNA fragments is not obtained by restriction endonuclease digestion;
(b) amplifying both strands of the DNA-adapter products to obtain a plurality of amplified polynucleotides;
(c) sequencing the plurality of amplified polynucleotides, thereby obtaining a plurality of reads each comprising a physical UMI sequence corresponding to a physical UMI on an adapter and a virtual UMI sequence corresponding to a virtual UMI on a double-stranded DNA fragment in the sample;
(d) identifying a plurality of physical UMI sequences for the plurality of reads;
(e) identifying a plurality of virtual UMI sequences for the plurality of reads; and
(f) determining sequences of the double-stranded DNA fragments in the sample using the plurality of reads obtained in (c), the plurality of physical UMI sequences identified in (d), and the plurality of virtual UMI sequences identified in (e), wherein (f) comprises:
(i) combining, for each double-stranded DNA fragment, a first plurality of reads and a second plurality of reads to determine a consensus nucleotide sequence, each read of the first plurality of reads comprising a first physical UMI sequence of the plurality of physical UMI sequences and a first virtual UMI sequence of the plurality of UMI sequences but not a second physical UMI sequence of the plurality of physical UMI sequences, each read of the second plurality of reads comprising the second physical UMI sequence and the first virtual UMI sequence but not the first physical UMI sequence; and
(ii) determining a sequence of the double-stranded DNA fragment using the consensus nucleotide sequence.
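
As an illustration only, steps (d) through (f) can be sketched in Python under simplifying assumptions that are not part of the claim: each read has already been parsed into a hypothetical (physical_umi, virtual_umi, sequence) tuple, all reads for a fragment are aligned and of equal length, the virtual UMI alone identifies the source fragment, and disagreements are resolved by a simple majority vote. The function names majority_consensus and duplex_consensus are invented for this sketch and do not come from the patent.

```python
from collections import Counter, defaultdict


def majority_consensus(seqs):
    """Per-position majority vote over equal-length sequences; ties become 'N'."""
    consensus = []
    for column in zip(*seqs):
        counts = Counter(column).most_common(2)
        if len(counts) > 1 and counts[0][1] == counts[1][1]:
            consensus.append("N")  # no clear majority at this position
        else:
            consensus.append(counts[0][0])
    return "".join(consensus)


def duplex_consensus(reads):
    """Combine two physical-UMI read families that share one virtual UMI.

    reads: iterable of (physical_umi, virtual_umi, sequence) tuples
           (a hypothetical, pre-parsed representation).
    Returns {virtual_umi: consensus_sequence} for every virtual UMI that is
    supported by exactly two distinct physical UMIs.
    """
    # Steps (d) and (e): index the reads by virtual UMI, then by physical UMI.
    families = defaultdict(lambda: defaultdict(list))
    for physical_umi, virtual_umi, seq in reads:
        families[virtual_umi][physical_umi].append(seq)

    result = {}
    for virtual_umi, by_physical in families.items():
        if len(by_physical) != 2:
            continue  # assumption: require one read family per physical UMI
        # Step (f)(i): collapse each family, then combine the two families.
        first, second = (majority_consensus(s) for s in by_physical.values())
        # Assumption: keep a base only where both families agree.
        result[virtual_umi] = "".join(
            a if a == b else "N" for a, b in zip(first, second)
        )
    return result


if __name__ == "__main__":
    example_reads = [
        ("AAC", "GGT", "ACGTAC"),
        ("AAC", "GGT", "ACGTAC"),
        ("AAC", "GGT", "ACTTAC"),  # one read carries an error at position 3
        ("TTG", "GGT", "ACGTAC"),
        ("TTG", "GGT", "ACGTAC"),
        ("TTG", "GGT", "ACGTAA"),  # one read carries an error at position 6
    ]
    print(duplex_consensus(example_reads))
```

In this sketch an error present in a single read is outvoted within its own physical-UMI family, and a base that differs between the two families is masked as N rather than called. A production pipeline would also need alignment, UMI error tolerance, and strand orientation handling, none of which is modeled here.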