| CPC G06F 16/1744 (2019.01) [H03M 7/3064 (2013.01)] | 18 Claims |

|
1. A method, comprising:
receiving an input file into a compression engine, wherein the input file includes multiple dimensions each associated with an axis;
aligning the input file along each of the axes at the same time to create a compression tensor that includes sequences in each of the axes, wherein aligning the input file includes inserting gaps into at least some of the sequences in each of the multiple axes such that each column of the multiple dimensions of the compression tensor includes identical data or both identical data and at least one gap, wherein each of the multiple dimensions of the compression tensor is associated with a consensus sequence representing columns of a dimension, wherein each dimension includes rows and columns, wherein each consensus sequence includes data for a corresponding dimension of the compression tensor;
wherein aligning the input file includes splitting the input file a plurality of times into multiple splits where each split includes a different portion of the input file, wherein each of the rows includes a split of the multiple splits;
determining a consensus tensor from the compression tensor, wherein the consensus tensor includes the multiple dimensions, wherein each dimension of the consensus tensor includes a corresponding consensus sequence; and
generating a compressed file that includes the consensus tensor and pointer lists, wherein each pointer list identifies a subsequence of the consensus tensor, the subsequence having a polygon shape in a portion of the consensus tensor.
|
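The steps recited in claim 1 can be illustrated with a minimal sketch. It is a simplified, one-dimensional illustration only, not the claimed method: it uses a single axis rather than a multi-dimensional tensor, fixed-width splits, and pointer lists that are flat index lists rather than polygon-shaped regions. The names `split_input`, `align`, `compress`, `decompress`, and the `width` parameter are hypothetical, and the alignment step borrows Python's `difflib.SequenceMatcher` as a stand-in for whatever alignment the compression engine actually performs.

```python
from difflib import SequenceMatcher

GAP = None  # marker for an inserted gap (assumption: any sentinel would do)


def split_input(data: str, width: int) -> list[str]:
    """Split the input into fixed-width pieces; each piece becomes one row."""
    return [data[i:i + width] for i in range(0, len(data), width)]


def align(rows: list[str]):
    """Align rows one at a time, inserting gaps so that every column holds
    identical data or identical data plus gaps. Returns the consensus
    sequence (one symbol per column) and the gapped rows."""
    consensus: list[str] = []
    aligned: list[list] = []
    for row in rows:
        matcher = SequenceMatcher(None, consensus, list(row), autojunk=False)
        new_consensus, new_row = [], []
        new_aligned = [[] for _ in aligned]
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag in ("equal", "delete", "replace"):
                # Keep existing columns; the new row gets a gap unless it matches.
                for k in range(i1, i2):
                    new_consensus.append(consensus[k])
                    for old, new in zip(aligned, new_aligned):
                        new.append(old[k])
                    new_row.append(consensus[k] if tag == "equal" else GAP)
            if tag in ("insert", "replace"):
                # Symbols only in the new row open new columns; older rows get gaps.
                for k in range(j1, j2):
                    new_consensus.append(row[k])
                    for new in new_aligned:
                        new.append(GAP)
                    new_row.append(row[k])
        consensus = new_consensus
        aligned = new_aligned + [new_row]
    return consensus, aligned


def compress(data: str, width: int):
    """Return the consensus plus one pointer list per row; each pointer list
    names the consensus columns that the row occupies."""
    consensus, aligned = align(split_input(data, width))
    pointer_lists = [[i for i, s in enumerate(r) if s is not GAP] for r in aligned]
    return consensus, pointer_lists


def decompress(consensus, pointer_lists) -> str:
    """Rebuild the input by reading the consensus through each pointer list."""
    return "".join(consensus[i] for ptrs in pointer_lists for i in ptrs)
```

In this sketch the compressed output is the pair returned by `compress`; passing that pair to `decompress` rebuilds the original string, because each row is by construction a subsequence of the consensus.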