CPC H04L 69/04 (2013.01) | 10 Claims |
1. A system, comprising:
one or more processors coupled to memory, the one or more processors configured to:
extract a plurality of first sequences from a sequence file;
generate a respective plurality of encoded sequences based on the plurality of first sequences extracted from the sequence file, wherein the respective plurality of encoded sequences are run-length encodings (RLE) of the plurality of first sequences;
generate a hash table that stores the respective plurality of encoded sequences;
combine at least two entries in the hash table based on a comparison of data generated from at least two of the respective plurality of encoded sequences, wherein combining the at least two entries comprises
determining that at least two RLE sequences of the respective plurality of encoded sequences match;
calculating, for each base of the at least two RLE sequences, a respective average of RLE counters of the at least two RLE sequences; and
combining the at least two RLE sequences in the hash table; and
transmit an output file including a plurality of decoded sequences generated based on the hash table.
|
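The combining step recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation, and every function name in it is hypothetical. It assumes the sequences have already been extracted from the sequence file into a list of strings, that two RLE sequences "match" when their run-collapsed base strings are identical, and that averaged counters are rounded half up when decoding. The claim fixes none of these details.

```python
from collections import defaultdict
from itertools import groupby


def rle_encode(seq):
    """Run-length encode a sequence: 'AAACCG' -> ('ACG', [3, 2, 1])."""
    bases, counts = [], []
    for base, run in groupby(seq):
        bases.append(base)
        counts.append(len(list(run)))
    return "".join(bases), counts


def build_table(sequences):
    """Hash table keyed by the run-collapsed bases; each value holds the
    RLE counter vectors of every sequence sharing that key (assumed
    matching rule, not taken from the claim)."""
    table = defaultdict(list)
    for seq in sequences:
        bases, counts = rle_encode(seq)
        table[bases].append(counts)
    return table


def combine_entries(table):
    """Merge matching entries by averaging the RLE counters per base."""
    combined = {}
    for bases, count_lists in table.items():
        n = len(count_lists)
        combined[bases] = [sum(col) / n for col in zip(*count_lists)]
    return combined


def decode_table(combined):
    """Expand each merged entry back into a plain sequence.
    Averages are rounded half up (an assumption of this sketch)."""
    return [
        "".join(b * int(c + 0.5) for b, c in zip(bases, counts))
        for bases, counts in combined.items()
    ]


reads = ["AAAACCG", "AACCG", "TTGA"]  # made-up example reads
merged = combine_entries(build_table(reads))
decoded = decode_table(merged)
```

Here the first two reads share the collapsed key `ACG`, so their counters `[4, 2, 1]` and `[2, 2, 1]` are averaged to `[3.0, 2.0, 1.0]` and decoded as a single sequence, while `TTGA` passes through unchanged. Writing the decoded sequences to an output file and transmitting it is omitted.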