CPC G06F 16/13 (2019.01) [G06F 16/27 (2019.01); G06F 16/285 (2019.01)] | 20 Claims |
1. A system for improved access to rows of data, each data row associated with a partition of a plurality of partitions, the data rows distributed in a plurality of files, wherein a file including data rows associated with different partitions of the plurality of partitions is an impure file, the system comprising:
a processor; and
a memory device that stores program code to be executed by the processor, the program code causing the processor to:
analyze a depth map that indicates the depth of each target partition of the plurality of partitions, the depth of each target partition based on a number of impure files having a data row associated with the respective target partition;
select a subset of impure files from a plurality of impure files based on the depth map analysis;
sort the data rows of the selected subset of the impure files according to a respective associated target partition of each of the data rows;
generate a set of disjoint partition range files based on the sorting; and
transfer each file of the disjoint partition range files to a respective target partition.
|
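The steps recited in claim 1 can be illustrated with a minimal Python sketch. This is not the claimed implementation; it rests on several assumptions that do not come from the claim: files are modeled as in-memory lists of `(partition, value)` rows, the selection heuristic picks impure files that touch the deepest partition (capped by a hypothetical `max_files` parameter), each generated range file covers exactly one partition, and "transfer" means appending to an in-memory store. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter


def partitions_of(rows):
    """Return the set of partitions referenced by a file's rows."""
    return {partition for partition, _ in rows}


def build_depth_map(files):
    """Depth of a partition = number of impure files holding one of its rows."""
    depth = defaultdict(int)
    for rows in files.values():
        parts = partitions_of(rows)
        if len(parts) > 1:  # impure: rows from more than one partition
            for partition in parts:
                depth[partition] += 1
    return dict(depth)


def select_impure_files(files, depth, max_files):
    """Assumed heuristic: take impure files that touch the deepest partition."""
    if not depth:
        return []
    # Ties are broken by partition name so the result is deterministic.
    deepest = max(sorted(depth), key=depth.get)
    candidates = sorted(
        name
        for name, rows in files.items()
        if len(partitions_of(rows)) > 1 and deepest in partitions_of(rows)
    )
    return candidates[:max_files]


def compact(files, selected):
    """Sort rows of the selected files by partition and split them into
    disjoint outputs (assumption: one output file per partition)."""
    rows = [row for name in selected for row in files[name]]
    rows.sort(key=itemgetter(0))  # stable sort on the partition key
    return {p: list(group) for p, group in groupby(rows, key=itemgetter(0))}


def transfer(range_files, partition_store):
    """Append each generated file to its target partition in the store."""
    for partition, rows in range_files.items():
        partition_store.setdefault(partition, []).append(rows)


# Example: f3 is pure (only partition A); the other files are impure.
files = {
    "f1": [("A", 1), ("B", 2)],
    "f2": [("B", 3), ("C", 4)],
    "f3": [("A", 5), ("A", 6)],
    "f4": [("B", 7), ("A", 8)],
}
depth = build_depth_map(files)
selected = select_impure_files(files, depth, max_files=2)
range_files = compact(files, selected)
store = {}
transfer(range_files, store)
```

In this example, partition B has the greatest depth because three impure files contain B rows, so the sketch selects impure files containing B. After compaction, each output holds rows of a single partition, so none of the generated files is impure.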