| CPC G06F 16/278 (2019.01) [G06F 16/254 (2019.01)] | 17 Claims |

|
1. A method, comprising:
receiving, by a computing system, a plurality of data sets sampled from data stored in a source system, wherein partitioning information is not available or not provided at the source system;
extracting, by the computing system and for each data set, a respective partitioning column, each respective partitioning column comprising a plurality of discrete values;
extracting, by the computing system and for each data set, a respective set of discrete values from the plurality of discrete values of the respective partitioning column;
generating, by the computing system, a respective cumulative deviation score for each respective set of discrete values;
comparing, by the computing system, the generated cumulative deviation scores for the respective sets of discrete values;
extracting, by the computing system, a final set of discrete values as the partitioning information to be used for partitioning the data based at least in part on the compared cumulative deviation scores; and
partitioning, by the computing system, the data stored in the source system based at least in part on the partitioning information.
|
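For illustration only, the steps recited in claim 1 can be sketched in Python as follows. The claim does not define how a partitioning column is chosen or how a cumulative deviation score is computed, so both are assumptions here: `pick_column` is a hypothetical low-cardinality heuristic, and `cumulative_deviation` is assumed to be the summed absolute deviation of per-value row counts from an even split, normalised by sample size. All function names are hypothetical and do not come from the claim.

```python
from collections import Counter


def pick_column(sample, max_distinct=10):
    """Hypothetical heuristic: choose the column with the fewest (but at
    least two) distinct values. The claim does not specify this step."""
    if not sample:
        return None
    best = None
    for col in sample[0].keys():
        n = len({row[col] for row in sample})
        if 2 <= n <= max_distinct and (best is None or n < best[1]):
            best = (col, n)
    return best[0] if best else None


def cumulative_deviation(values):
    """Assumed score: sum of |count - even share| over the distinct values,
    divided by the number of rows. 0.0 means perfectly balanced."""
    counts = Counter(values)
    if not counts:
        return float("inf")
    total = sum(counts.values())
    even_share = total / len(counts)
    return sum(abs(c - even_share) for c in counts.values()) / total


def derive_partitioning(samples):
    """Score each sampled data set and keep the discrete values of the
    sample whose score is lowest (assumed comparison rule)."""
    scored = []
    for sample in samples:
        col = pick_column(sample)
        if col is None:
            continue
        values = [row[col] for row in sample]
        scored.append((cumulative_deviation(values), col, sorted(set(values))))
    if not scored:
        raise ValueError("no sample yielded a usable partitioning column")
    _, column, final_values = min(scored, key=lambda item: item[0])
    return column, final_values


def partition(rows, column, final_values):
    """Group rows by the chosen values; rows with unseen values go to overflow."""
    buckets = {value: [] for value in final_values}
    overflow = []
    for row in rows:
        buckets.get(row.get(column), overflow).append(row)
    return buckets, overflow
```

In this sketch a lower score means a more even spread of rows across the discrete values, so the sample with the lowest score supplies the final set. A different scoring or comparison rule would fit the claim language equally well.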