| CPC G16B 50/50 (2019.02) [G16B 30/10 (2019.02); G06N 3/12 (2013.01)] | 20 Claims |

|
1. A system comprising:
one or more processors; and
memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
obtaining an amount of digital data stored in one or more data files;
determining a string of characters that corresponds to the digital data according to a first encoding scheme such that individual characters of the string of characters are represented by a nucleotide included in at least one of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA);
generating nucleic acid representations based on the string of characters, wherein individual nucleic acid representations include a plurality of positions and a portion of the string of characters is distributed among the nucleic acid representations at an individual position across the nucleic acid representations according to a second encoding scheme;
receiving a request to access at least a portion of the digital data;
obtaining a plurality of sequencing reads that correspond to nucleic acids synthesized according to the nucleic acid representations, wherein the plurality of sequencing reads indicate nucleotides present at individual positions of the nucleic acids;
performing a process to align individual positions of the plurality of sequencing reads to generate aligned sequencing reads;
for a given position of the aligned sequencing reads, determining an additional string of characters that corresponds to a nucleotide present at the given position of the aligned sequencing reads; and
generating, using a decoding scheme, a requested portion of the digital data based on the additional string of characters.
|
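
The encode and decode path recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and the claim does not fix any particular scheme. The sketch assumes a hypothetical first encoding scheme that maps two bits to one DNA nucleotide, and a hypothetical second encoding scheme that places character i of the string in representation i mod N at position i div N. It also assumes the sequencing reads are already aligned, error-free, and supplied in representation order. All function names are illustrative.

```python
# Hypothetical 2-bit-per-nucleotide mapping (assumed first encoding scheme).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def bytes_to_bases(data: bytes) -> str:
    """Convert digital data to a string of nucleotide characters."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def distribute(bases: str, n_representations: int) -> list[str]:
    """Assumed second encoding scheme: character i goes to representation
    i % n_representations, so each run of consecutive characters is spread
    across all representations at a single position."""
    return [bases[k::n_representations] for k in range(n_representations)]


def gather(aligned_reads: list[str]) -> str:
    """For each position, collect the nucleotide present in every aligned
    read, rebuilding the string of characters in its original order."""
    longest = max(len(read) for read in aligned_reads)
    out = []
    for pos in range(longest):
        for read in aligned_reads:
            if pos < len(read):  # later representations may be one base shorter
                out.append(read[pos])
    return "".join(out)


def bases_to_bytes(bases: str) -> bytes:
    """Decoding scheme: invert the 2-bit mapping back to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in bases)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Round trip under the assumptions above; the "reads" here are simply the
# stored representations, standing in for aligned sequencing output.
representations = distribute(bytes_to_bases(b"Hi"), 3)
recovered = bases_to_bytes(gather(representations))
```

In practice, synthesis and sequencing introduce errors, so a real pipeline would rely on the alignment step and on redundancy across multiple reads per representation. The sketch omits both.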