| CPC G06F 16/211 (2019.01) [G06F 16/2282 (2019.01); G06N 20/00 (2019.01)] | 20 Claims |

|
1. A computer-implemented method for transferable feature engineering and synthetic data generation, the computer-implemented method comprising:
retrieving a plurality of data tables, wherein the plurality of data tables are heterogeneous in format and content;
removing at least one timestamp from the plurality of data tables to reduce noise within the plurality of data tables;
generating a variational auto-encoder (VAE) model;
training the VAE model on the plurality of data tables after removal of the at least one timestamp;
receiving an input data table;
generating a synthetic data table based on the input data table and the trained VAE model;
determining a subset of the plurality of data tables which have a lower column width than a maximum column width of the plurality of data tables;
inserting blank columns up to the maximum column width for the subset of the plurality of data tables; and
inserting a predetermined data value in the blank columns to prevent the VAE model from training the blank columns with the predetermined data.
|
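The preprocessing steps recited in claim 1 (timestamp removal, padding narrower tables to the maximum column width, and filling the blank columns with a predetermined value) can be illustrated with a minimal sketch. It is not the claimed implementation. The names `PAD_VALUE`, `drop_timestamp_columns`, and `pad_to_max_width` are hypothetical, as are the sentinel value of -1.0 and the per-column mask. The mask is one assumed way a training loop might exclude padded columns from a reconstruction loss; the claim itself does not specify a mechanism.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical "predetermined data value" used to fill inserted blank columns.
PAD_VALUE = -1.0

# A table is modeled here as a mapping from column name to column values.
Table = Dict[str, List[float]]


def drop_timestamp_columns(table: Table, timestamp_columns: Set[str]) -> Table:
    """Return a copy of the table without the named timestamp columns."""
    return {name: values for name, values in table.items()
            if name not in timestamp_columns}


def pad_to_max_width(tables: List[Table]) -> List[Tuple[List[List[float]], List[bool]]]:
    """Pad every table to the column count of the widest table.

    Returns, per table, a row-major matrix and a per-column mask in which
    True marks an original column and False marks an inserted blank column.
    """
    max_width = max(len(t) for t in tables)
    padded = []
    for table in tables:
        columns = list(table.values())
        n_rows = len(columns[0]) if columns else 0
        n_blank = max_width - len(columns)
        # Insert blank columns filled with the predetermined value.
        columns = columns + [[PAD_VALUE] * n_rows for _ in range(n_blank)]
        mask = [True] * (max_width - n_blank) + [False] * n_blank
        rows = [list(row) for row in zip(*columns)]
        padded.append((rows, mask))
    return padded


tables = [
    {"a": [1.0, 2.0], "b": [3.0, 4.0], "ts": [0.0, 0.0]},
    {"x": [5.0, 6.0]},
]
cleaned = [drop_timestamp_columns(t, {"ts"}) for t in tables]
padded = pad_to_max_width(cleaned)
```

In this sketch the second table gains one blank column filled with `PAD_VALUE`, and its mask flags that column so that a downstream model could ignore it during training.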