CPC G06F 16/254 (2019.01) [G06F 16/211 (2019.01); G06F 16/221 (2019.01)] | 20 Claims |
1. A computer-implemented method for optimizing a flow of data within extract, transform, load (ETL) data processing pipelines, the method comprising:
identifying which database columns from a source database are to be transformed in data processing stages of a processing segment of an ETL data processing pipeline and which database columns from said source database are not to be transformed in said data processing stages of said processing segment of said ETL data processing pipeline;
grouping database columns to be transformed into a processing schema;
performing transformations on said database columns of said processing schema;
grouping database columns that are not to be transformed into a non-processing schema;
creating a large object data type to reference said non-processing schema; and
creating and inserting an identifier in said data processing stages to identify said large object data type thereby avoiding copying of said database columns that are not to be transformed in said data processing stages.
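The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: `LobStore`, `split_row`, `run_segment`, the `_lob_id` field, and the sample stages are all hypothetical names, and an in-memory dictionary stands in for a real large object data type. The sketch only shows the idea that stages carry the columns to be transformed plus a small identifier, while the untransformed columns stay in one place and are rejoined at the end of the segment.

```python
from itertools import count


class LobStore:
    """Hypothetical in-memory stand-in for large-object storage."""

    def __init__(self):
        self._objects = {}
        self._ids = count(1)

    def put(self, payload):
        # Store the payload once and hand back a small identifier.
        lob_id = next(self._ids)
        self._objects[lob_id] = payload
        return lob_id

    def get(self, lob_id):
        return self._objects[lob_id]


def split_row(row, processing_columns, store):
    """Split one source row into a processing record plus a LOB identifier."""
    # Processing schema: columns that the stages will transform.
    record = {c: row[c] for c in processing_columns}
    # Non-processing schema: everything else, held behind the identifier.
    passthrough = {c: v for c, v in row.items() if c not in processing_columns}
    record["_lob_id"] = store.put(passthrough)
    return record


def run_segment(rows, stages, processing_columns):
    """Run the stages on processing records, then rejoin pass-through columns."""
    store = LobStore()
    records = [split_row(r, processing_columns, store) for r in rows]
    for stage in stages:
        # Each stage sees only the processing columns and the identifier.
        records = [stage(rec) for rec in records]
    output = []
    for rec in records:
        rec = dict(rec)
        passthrough = store.get(rec.pop("_lob_id"))
        output.append({**rec, **passthrough})
    return output


# Hypothetical stages; each preserves fields it does not touch.
def uppercase_name(rec):
    return {**rec, "name": rec["name"].upper()}


def add_tax(rec):
    return {**rec, "amount": round(rec["amount"] * 1.1, 2)}


rows = [
    {"id": 1, "name": "ada", "amount": 100.0,
     "notes": "long free text", "photo": b"\x00" * 1024},
]
result = run_segment(rows, [uppercase_name, add_tax], {"name", "amount"})
```

In this sketch the `photo` and `notes` values are never duplicated between stages; only the integer identifier travels with each record. A real pipeline would also need to decide how the large object is persisted and when it is released, which the sketch leaves out.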