| CPC G06F 16/1752 (2019.01) [G06F 3/0608 (2013.01); G06F 3/0641 (2013.01); G06F 3/067 (2013.01)] | 14 Claims |

|
1. A system for adaptive bandwidth-efficient data encoding, comprising:
a computing device comprising a processor and a memory;
a sequence analyzer comprising a first plurality of programming instructions stored in the memory and operable on the processor, wherein the first plurality of programming instructions, when operating on the processor, cause the processor to:
receive a sequence dataset;
scan the sequence dataset and maintain a count of the number of unique characters contained within the sequence dataset;
for each occurrence of a unique character which causes the count of the number of unique characters to reach a value equal to a power of two, indicate a position in the sequence dataset corresponding to the unique character;
calculate, for each of the indicated positions, a compaction ratio that would be obtained by dividing the sequence dataset into one of a plurality of segments at one of the indicated positions;
deconstruct the sequence dataset into a plurality of deconstructed sourceblocks at the positions that yield the best compaction ratio; and
pass the plurality of deconstructed sourceblocks to a data deconstruction engine;
an adaptive sourceblock optimizer configured to determine and dynamically adjust an optimal sourceblock size based on sequence complexity, alphabet size, and frequency distribution of characters;
a data deconstruction engine comprising a second plurality of programming instructions stored in the memory and operable on the processor, wherein the second plurality of programming instructions, when operating on the processor, cause the processor to:
receive the plurality of deconstructed sourceblocks from the sequence analyzer;
deconstruct the sequence dataset into sourceblocks using the optimal sourceblock size from the adaptive sourceblock optimizer; and
create a plurality of codewords for storage or transmission of the sequence dataset.
|
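
The sequence-analyzer steps recited in claim 1 can be illustrated with a minimal sketch. The claim does not define how the compaction ratio is calculated, so the code below assumes a simple fixed-width cost model: each segment is charged the number of bits needed to index its own alphabet, and the original is charged 8 bits per character. The function names (`power_of_two_positions`, `fixed_width_bits`, `best_split`), the single-split simplification, and the treatment of 1 as a power of two (2^0) are all illustrative assumptions, not the claimed implementation.

```python
def power_of_two_positions(seq):
    """Indices where the running count of distinct characters reaches a power of two."""
    seen = set()
    positions = []
    for i, ch in enumerate(seq):
        if ch not in seen:
            seen.add(ch)
            n = len(seen)
            # n & (n - 1) == 0 holds exactly when n is 1, 2, 4, 8, ...
            if n & (n - 1) == 0:
                positions.append(i)
    return positions


def fixed_width_bits(segment):
    """Assumed cost model: bits to store a segment with fixed-width codes."""
    if not segment:
        return 0
    k = len(set(segment))
    # (k - 1).bit_length() equals ceil(log2(k)); use at least 1 bit per character.
    return len(segment) * max(1, (k - 1).bit_length())


def best_split(seq):
    """Pick the indicated position whose two-way split gives the best assumed ratio."""
    candidates = power_of_two_positions(seq)
    if not candidates:
        return None

    def ratio(p):
        encoded = fixed_width_bits(seq[:p]) + fixed_width_bits(seq[p:])
        return (8 * len(seq)) / encoded  # assumed: 8 bits per original character

    return max(candidates, key=ratio)


if __name__ == "__main__":
    data = "aaaaaaaabcdbcd"
    print(power_of_two_positions(data))  # [0, 8, 10]
    print(best_split(data))              # 8
```

Under this assumed cost model, splitting `"aaaaaaaabcdbcd"` at index 8 isolates the single-character run, which needs 1 bit per character, from the three-character tail, which needs 2 bits per character. The total is 20 bits, against 28 bits for either of the other indicated positions.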