US 12,406,750 B2
Encoding digital data using oligonucleotides
Allan Eduardo Feitosa, Butantã (BR); Thiago Yuji Aoyagi, Butantã (BR); Adriano Galindo Leal, Butantã (BR); Andre Guilherme da Costa-Martins, Butantã (BR); Cristina Maria Ferreira da Silva, Butantã (BR); Diego Trindade de Souza, Butantã (BR); Eduardo Takeo Ueda, Butantã (BR); Marcelo Gonzaga de Oliveira Parada, Butantã (BR); and Bruno Marinaro Verona, Butantã (BR)
Assigned to Lenovo (Singapore) Pte. Ltd., Singapore (SG)
Filed by Lenovo (Singapore) Pte. Ltd., Singapore (SG)
Filed on Aug. 31, 2023, as Appl. No. 18/459,312.
Prior Publication US 2025/0078958 A1, Mar. 6, 2025
Int. Cl. G16B 50/50 (2019.01); G16B 30/10 (2019.01); G06N 3/12 (2023.01)
CPC G16B 50/50 (2019.02) [G16B 30/10 (2019.02); G06N 3/12 (2013.01)]
20 Claims
1. A system comprising:
one or more processors; and
memory storing computer-readable instructions that, when executed by the one or more processors, cause the one or more processors to perform operations comprising:
obtaining an amount of digital data stored in one or more data files;
determining a string of characters that corresponds to the digital data according to a first encoding scheme such that individual characters of the string of characters are represented by a nucleotide included in at least one of deoxyribonucleic acid (DNA) or ribonucleic acid (RNA);
generating nucleic acid representations based on the string of characters, wherein individual nucleic acid representations include a plurality of positions and a portion of the string of characters is distributed among the nucleic acid representations at an individual position across the nucleic acid representations according to a second encoding scheme;
receiving a request to access at least a portion of the digital data;
obtaining a plurality of sequencing reads that correspond to nucleic acids synthesized according to the nucleic acid representations, wherein the plurality of sequencing reads indicate nucleotides present at individual positions of the nucleic acids;
performing a process to align individual positions of the plurality of sequencing reads to generate aligned sequencing reads;
for a given position of the aligned sequencing reads, determining an additional string of characters that corresponds to a nucleotide present at the given position of the aligned sequencing reads; and
generating, using a decoding scheme, a requested portion of the digital data based on the additional string of characters.
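The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; every name and parameter below is hypothetical. It assumes a two-bits-per-nucleotide mapping (00=A, 01=C, 10=G, 11=T) as the first encoding scheme, a simple round-robin interleave as the second encoding scheme (so that each run of consecutive characters is spread across the same position of successive nucleic acid representations), reads that are already grouped by source oligonucleotide and of equal length (substitution errors only, so alignment is trivial), and a per-position majority vote in place of a real alignment and consensus process.

```python
from collections import Counter

BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_bases(data: bytes) -> str:
    """First encoding scheme (assumed): two bits per nucleotide, high bits first."""
    return "".join(BASES[(b >> s) & 0b11] for b in data for s in (6, 4, 2, 0))


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases; len(seq) must be a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for ch in seq[i:i + 4]:
            value = (value << 2) | BASES.index(ch)
        out.append(value)
    return bytes(out)


def interleave(seq: str, n_oligos: int) -> list[str]:
    """Second encoding scheme (assumed): character p*n_oligos + k of the
    string is placed at position p of oligo k."""
    seq += "A" * ((-len(seq)) % n_oligos)  # pad to a whole number of columns
    return [seq[k::n_oligos] for k in range(n_oligos)]


def consensus(reads: list[str]) -> str:
    """Majority vote per position over equal-length, pre-aligned reads."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def decode_reads(reads_by_oligo: dict[int, list[str]], n_bases: int) -> bytes:
    """Rebuild the data: consensus per oligo, then read each position
    across all oligos to recover the original character order."""
    oligos = [consensus(reads_by_oligo[k]) for k in sorted(reads_by_oligo)]
    columns = ("".join(col) for col in zip(*oligos))
    return bases_to_bytes("".join(columns)[:n_bases])  # drop padding


if __name__ == "__main__":
    data = b"Hi!"
    seq = bytes_to_bases(data)
    oligos = interleave(seq, 3)
    # Simulated sequencing reads: three copies per oligo, two with one substitution.
    reads = {
        0: [oligos[0], oligos[0], "CTGG"],
        1: [oligos[1]] * 3,
        2: [oligos[2], "GGAT", oligos[2]],
    }
    print(seq, oligos, decode_reads(reads, len(seq)))
```

In this sketch the number of bases (`n_bases`) and the oligo index are passed alongside the reads; a practical system would need to carry such metadata in the synthesized sequences themselves and would also need to handle insertions and deletions during alignment.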