| CPC G16B 30/10 (2019.02) [G06F 16/2255 (2019.01); G06F 16/24578 (2019.01); G16B 30/20 (2019.02)] | 27 Claims | 
| 
            1. A computer-implemented method for determining a nucleotide sequence from nucleotide sequencing reads, comprising:
              receiving a plurality of first nucleotide sequencing reads and a second nucleotide sequencing read associated with each first nucleotide sequencing read;
              for each first nucleotide sequencing read and associated second nucleotide sequencing read:
                generating a plurality of first identifier subsequences from a first identifier sequence of the first nucleotide sequencing read comprising subsequences of the first identifier sequence;
                generating a plurality of second identifier subsequences from a second identifier sequence of the second nucleotide sequencing read comprising subsequences of the second identifier sequence;
                for each first identifier subsequence and second identifier subsequence, determining a plurality of hashes using a plurality of hash functions;
                generating a first signature for the first nucleotide sequencing read comprising a plurality of first signature hashes for a plurality of first positions, wherein a first signature hash is selected from the hashes of the plurality of hashes determined for the plurality of first identifier subsequences at the first position;
                generating a second signature for the second nucleotide sequencing read comprising a plurality of second signature hashes for a plurality of second positions, wherein a second signature hash is selected from the hashes of the plurality of hashes determined for the plurality of second identifier subsequences at the second position; and
                assigning the first nucleotide sequencing read or the second nucleotide sequencing read to at least one first particular bin of a first hash data structure based on the first signature or based on the second signature, wherein keys of bins of the first hash data structure are stored in a first key data structure and keys of bins of a second hash data structure are stored in a second key data structure and wherein the assigning comprises using a first stored key of the first key data structure or a second stored key of the second key data structure; and
              determining a nucleotide sequence for each first particular bin of the first hash data structure with one or more first nucleotide sequencing reads assigned.
               |
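
The steps recited in claim 1 resemble MinHash signatures combined with locality-sensitive binning. The following Python sketch illustrates one possible reading and is not the claimed implementation. Everything not named in the claim is an assumption made for illustration: the helper names, the subsequence length `k`, the number of signature positions, the use of `hashlib.blake2b` with a per-position salt as the family of hash functions, choosing the minimum hash at each position, keying bins on bands of the signature, and a per-column majority vote as the way a sequence is determined for a bin. For brevity only the first read of each pair is binned; a second read could be handled the same way with its own hash data structure and key set.

```python
import hashlib
from collections import Counter, defaultdict


def subsequences(identifier, k):
    """All overlapping length-k subsequences of an identifier sequence."""
    return [identifier[i:i + k] for i in range(len(identifier) - k + 1)]


def seeded_hash(subsequence, seed):
    """One member of a hash-function family, selected by an integer seed."""
    digest = hashlib.blake2b(
        subsequence.encode("ascii"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def signature(identifier, k=4, num_positions=8):
    """Signature with one hash per position; position p uses hash function p.

    Assumption: the hash selected at each position is the minimum over all
    subsequences (MinHash-style). The identifier must have at least k bases.
    """
    subs = subsequences(identifier, k)
    return tuple(
        min(seeded_hash(s, p) for s in subs) for p in range(num_positions)
    )


def assign_to_bins(records, band_size=2):
    """Assign each read to one bin per band of its identifier signature.

    records: iterable of (identifier_sequence, read_sequence) pairs.
    Returns the hash data structure (bins) and a separate set of its keys,
    mirroring the claim's distinct key data structure (hypothetical layout).
    """
    bins = defaultdict(list)
    stored_keys = set()
    for identifier, read in records:
        sig = signature(identifier)
        for start in range(0, len(sig), band_size):
            key = (start, sig[start:start + band_size])
            stored_keys.add(key)
            bins[key].append(read)
    return bins, stored_keys


def consensus(reads):
    """Per-column majority vote; assumes all reads have the same length."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


if __name__ == "__main__":
    records = [
        ("ACGTACGTAC", "AAAA"),
        ("ACGTACGTAC", "AAAT"),
        ("ACGTACGTAC", "AAAA"),
    ]
    bins, stored_keys = assign_to_bins(records)
    for key in sorted(stored_keys):
        print(key[0], consensus(bins[key]))
```

In this sketch, reads whose identifiers are identical always share every bin, and reads whose identifiers differ by a few bases are likely to share at least one band and therefore at least one bin, which is the usual motivation for banding a signature.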