| CPC G06F 16/24578 (2019.01) [G06F 16/2282 (2019.01)] | 20 Claims |

|
1. A system, comprising:
one or more memory units comprising one or more instructions;
one or more processors communicatively coupled to the one or more memory units, the one or more processors configured, upon executing the one or more instructions, to:
for each of one or more lookup tables:
access a table that comprises a first set of data, wherein the first set of data comprises a plurality of first independent variables, wherein all of the plurality of the first independent variables are one-hot encoded;
rank the plurality of first independent variables, and sort the plurality of first independent variables into a first order based on the rank;
distribute the sorted first independent variables into a plurality of first bands of first varying resolution;
generate a first hash for each record of the first set of data at each level of the plurality of first bands;
generate a respective lookup table of the one or more lookup tables based on the first hashes;
access a table that comprises a second set of data, wherein the second set of data comprises a plurality of second independent variables, wherein all of the plurality of the second independent variables are one-hot encoded;
rank and sort the plurality of second independent variables into a second order that is the same as the first order of the sorted first independent variables;
distribute the sorted second independent variables into a plurality of second bands of a second varying resolution, wherein the distribution of the sorted second independent variables into the plurality of second bands of the second varying resolution is the same as the distribution of the sorted first independent variables into the plurality of first bands of the first varying resolution, wherein the plurality of second bands of the second varying resolution is the same as the plurality of first bands of the first varying resolution;
generate a second hash for each record of the second set of data at each level of the plurality of second bands, so as to create a model-ready table; and
join the respective lookup table to the model-ready table on matching hashes of the first and second hashes.
|
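The steps recited in claim 1 can be illustrated with a minimal sketch. It is one possible reading under stated assumptions, not the claimed implementation: variables are ranked by how often they are set in the first data set, the bands are nested prefixes of the sorted variables (a longer prefix standing in for a finer resolution), each hash is a truncated SHA-256 of the record's bits within a band, and the lookup table stores a simple record count. All function names, the band sizes, and the sample records are hypothetical.

```python
import hashlib
from collections import Counter

def rank_columns(records, columns):
    """Sort columns by how often they are set (descending), ties by name.
    Assumption: frequency is the ranking criterion; the claim does not fix one."""
    freq = {c: sum(r[c] for r in records) for c in columns}
    return sorted(columns, key=lambda c: (-freq[c], c))

def make_bands(order, sizes):
    """Assumption: bands are nested prefixes of the sorted columns."""
    return [order[:n] for n in sizes]

def band_hash(record, band):
    """Hash the record's one-hot bits restricted to one band."""
    bits = "".join(str(record[c]) for c in band)
    return hashlib.sha256(bits.encode()).hexdigest()[:12]

def hash_records(records, bands):
    """One hash per record per band level."""
    return [[band_hash(r, b) for b in bands] for r in records]

def build_lookup(hashes):
    """Lookup table keyed by (level, hash); the payload is a record count
    (a placeholder for whatever values a real lookup table would carry)."""
    return Counter((lvl, h) for row in hashes for lvl, h in enumerate(row))

def join(lookup, hashes):
    """Left join: per record, the lookup payload at each level, or None."""
    return [[lookup.get((lvl, h)) for lvl, h in enumerate(row)] for row in hashes]

columns = ["a", "b", "c", "d"]
first = [  # hypothetical first set of data (one-hot encoded)
    {"a": 1, "b": 0, "c": 1, "d": 0},
    {"a": 1, "b": 0, "c": 0, "d": 1},
    {"a": 0, "b": 1, "c": 1, "d": 0},
]
second = [  # hypothetical second set of data
    {"a": 1, "b": 0, "c": 1, "d": 1},
    {"a": 0, "b": 0, "c": 0, "d": 0},
]

order = rank_columns(first, columns)       # order derived from the first set
bands = make_bands(order, sizes=[2, 4])    # same bands reused for both sets
lookup = build_lookup(hash_records(first, bands))
model_ready = hash_records(second, bands)  # second hashes, same order and bands
joined = join(lookup, model_ready)
```

In this sketch the first record of the second data set matches a lookup entry only at the coarse band, while the second record matches at no level. That shows why hashing at several resolutions can be useful: a record with no exact match at the finest band may still join at a coarser one.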