US 12,260,937 B2
Reverse concatenation of error-correcting codes in DNA data storage
Sergey Yekhanin, Redmond, WA (US); Sivakanth Gopi, Redmond, WA (US); Henry Pfister, Durham, NC (US); and Karin Strauss, Seattle, WA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
Filed on Sep. 5, 2019, as Appl. No. 16/562,183.
Prior Publication US 2021/0074380 A1, Mar. 11, 2021
Int. Cl. G16B 30/10 (2019.01); G06F 11/08 (2006.01); G06F 16/903 (2019.01)
CPC G16B 30/10 (2019.02) [G06F 11/085 (2013.01); G06F 16/90344 (2019.01)]
19 Claims
 
1. A method comprising:
by a computing system, for input nucleotide symbol strings representing input data to be encoded as nucleotides, converting the input nucleotide symbol strings to constrained nucleotide symbol strings completely representing the input nucleotide symbol strings and satisfying a consecutive homopolymer coding constraint that comprises limiting homopolymer runs to n consecutive instances, wherein n is an integer greater than 0;
by the computing system, after converting the input nucleotide symbol strings to the constrained nucleotide symbol strings, calculating a redundancy code for the constrained nucleotide symbol strings, wherein the redundancy code carries redundancy information for the constrained nucleotide symbol strings and comprises a plurality of redundancy code nucleotide symbols;
by the computing system, incorporating the redundancy code nucleotide symbols of the redundancy code and the constrained nucleotide symbol strings into result nucleotide symbol strings, wherein the result strings satisfy a relaxed version of the consecutive homopolymer coding constraint, completely represent the input nucleotide symbol strings, and comprise the redundancy information for the constrained nucleotide symbol strings, wherein incorporating the redundancy code nucleotide symbols comprises interleaving the redundancy code nucleotide symbols into the constrained nucleotide symbol strings while satisfying the relaxed version of the consecutive homopolymer coding constraint;
instructing an oligonucleotide synthesizer to chemically synthesize DNA molecules according to the result nucleotide symbol strings; and
synthesizing a plurality of nucleotide strands according to the result nucleotide symbol strings, wherein the nucleotide strands satisfy the relaxed version of the consecutive homopolymer constraint and represent redundancy code nucleotide symbols calculated after converting the input nucleotide symbol strings to the constrained nucleotide symbol strings.
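
The computational steps of claim 1 (constrain first, compute redundancy second, interleave third) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the function names are hypothetical, the constrained code is a simple symbol-stuffing scheme (requiring n >= 2), and the "redundancy code" is a toy mod-4 checksum per block rather than a real error-correcting code. In this particular sketch, inserting one checksum symbol at least every k > n payload symbols gives a relaxed run limit of n + 1; the claim itself does not state a specific relaxed limit. The synthesis steps are hardware operations and are not modeled.

```python
ALPHABET = "ACGT"

def max_run(s):
    """Length of the longest homopolymer run in s (0 for an empty string)."""
    best = run = 0
    prev = None
    for sym in s:
        run = run + 1 if sym == prev else 1
        best = max(best, run)
        prev = sym
    return best

def constrain(s, n):
    """Toy invertible constrained code: after every run of n identical
    symbols, stuff one different symbol so no run exceeds n."""
    if n < 2:
        raise ValueError("this stuffing sketch assumes n >= 2")
    out, run = [], 0
    for sym in s:
        run = run + 1 if out and out[-1] == sym else 1
        out.append(sym)
        if run == n:
            # The stuffed symbol differs from sym and starts a new run.
            out.append(ALPHABET[(ALPHABET.index(sym) + 1) % 4])
            run = 1
    return "".join(out)

def unconstrain(c, n):
    """Inverse of constrain: drop the symbol that follows each run of n."""
    out, run, prev, skip = [], 0, None, False
    for sym in c:
        if skip:
            # This is a stuffed symbol; it still starts a new run.
            prev, run, skip = sym, 1, False
            continue
        run = run + 1 if sym == prev else 1
        prev = sym
        out.append(sym)
        skip = run == n
    return "".join(out)

def interleave_parity(constrained, k):
    """Append one mod-4 checksum symbol after each block of k payload
    symbols. A short tail is merged into the last block, so checksum
    symbols are always at least k payload symbols apart."""
    n_blocks = max(1, len(constrained) // k)
    out = []
    for b in range(n_blocks):
        end = (b + 1) * k if b < n_blocks - 1 else len(constrained)
        block = constrained[b * k:end]
        parity = ALPHABET[sum(ALPHABET.index(x) for x in block) % 4]
        out.append(block + parity)
    return "".join(out)

if __name__ == "__main__":
    n, k = 2, 3  # assumed values; the sketch needs k > n
    data = "AAAACCCCGT"
    constrained = constrain(data, n)
    result = interleave_parity(constrained, k)
    print(constrained, max_run(constrained))
    print(result, max_run(result))
```

Computing the checksum over the already constrained string mirrors the ordering in the claim: the redundancy protects the constrained representation directly, and the interleaved symbols can lengthen any one run by at most one symbol in this sketch, which is why the result only needs to meet a relaxed limit.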