US 12,332,908 B2
Compact error tracking logs for ETL
Pascal Gonel, Paris (FR); Kamel Msalmi, Paris (FR); and Alexandre Codjovi, Paris (FR)
Assigned to SAP SE, Walldorf (DE)
Filed by SAP SE, Walldorf (DE)
Filed on Aug. 2, 2022, as Appl. No. 17/816,829.
Prior Publication US 2024/0045880 A1, Feb. 8, 2024
Int. Cl. G06F 16/25 (2019.01); G06F 16/27 (2019.01); G06F 16/28 (2019.01)
CPC G06F 16/254 (2019.01) [G06F 16/273 (2019.01); G06F 16/283 (2019.01)]
20 Claims
1. A computer-implemented method, comprising:
initiating an extract transform and load from a first system to a second system;
in response to the initiating, performing the extract transform and load by extracting input data at the first system, transforming the input data using one or more rules to form generated data, and loading the generated data into the second system;
during at least a portion of the extract transform and load,
storing a data object including a snapshot and a log having a compact form, wherein the snapshot includes the generated data and the log indicates, for a row in the generated data, which of the one or more rules was used to form the row in the generated data, wherein a format of the log includes a first identifier followed by a second identifier, wherein the first identifier identifies which rule of a first mapping was used to form the row of the generated data, and wherein the second identifier identifies which rule of a second mapping was used to form the row of the generated data, and
storing an aggregation table including an aggregated row identifier identifying a row of the generated data and further including one or more source row identifiers identifying which one or more rows in the input data were used to form the row of the generated data;
in response to a received query, generating a response by at least accessing the aggregation table and the data object, wherein the response includes, based on the access to the data object, a portion of the input data which was used to generate a given row in the generated data, and wherein the response further includes an indication, based on the access to the aggregation table, regarding which of the one or more rules were applied to the given row; and
in response to presenting the response, receiving feedback in the form of a modification to at least one of the one or more rules of the extract transform and load, and re-executing the extract transform and load with the modification.
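
By way of a non-limiting illustration, the storing and querying steps of claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the rule identifiers (R1, R2, S1, S2), the "|" separator between the first and second identifiers, the field names, and the functions run_etl and explain are all hypothetical and do not appear in the claim. The sketch follows the definitions given in the storing steps, in which the log records the rule identifiers and the aggregation table records the source row identifiers.

```python
from collections import defaultdict

# Hypothetical rule mappings: rule identifier -> (predicate, output value).
# Identifiers and predicates are illustrative assumptions only.
FIRST_MAPPING = {
    "R1": (lambda row: row["country"] == "FR", "EMEA"),
    "R2": (lambda row: row["country"] == "US", "AMER"),
}
SECOND_MAPPING = {
    "S1": (lambda row: row["amount"] >= 0, "REVENUE"),
    "S2": (lambda row: row["amount"] < 0, "REFUND"),
}


def apply_mapping(mapping, row):
    """Return (rule_id, output_value) for the first rule that matches the row."""
    for rule_id, (predicate, value) in mapping.items():
        if predicate(row):
            return rule_id, value
    raise ValueError(f"no rule matches row {row!r}")


def run_etl(input_rows):
    """Transform and aggregate the rows; return the data object and aggregation table."""
    groups = defaultdict(list)
    for row in input_rows:
        first_id, region = apply_mapping(FIRST_MAPPING, row)
        second_id, category = apply_mapping(SECOND_MAPPING, row)
        groups[(first_id, second_id, region, category)].append(row)

    snapshot, log, aggregation_table = [], {}, {}
    for agg_id, ((first_id, second_id, region, category), rows) in enumerate(
        groups.items(), start=1
    ):
        snapshot.append(
            {
                "agg_id": agg_id,
                "region": region,
                "category": category,
                "total": sum(r["amount"] for r in rows),
            }
        )
        # Compact log entry: first identifier followed by second identifier.
        log[agg_id] = f"{first_id}|{second_id}"
        # Aggregated row identifier -> source row identifiers.
        aggregation_table[agg_id] = [r["id"] for r in rows]

    return {"snapshot": snapshot, "log": log}, aggregation_table


def explain(agg_id, input_rows, data_object, aggregation_table):
    """Answer a query: which input rows and which rules produced a generated row."""
    by_id = {r["id"]: r for r in input_rows}
    first_id, second_id = data_object["log"][agg_id].split("|")
    generated = next(r for r in data_object["snapshot"] if r["agg_id"] == agg_id)
    return {
        "row": generated,
        "source_rows": [by_id[i] for i in aggregation_table[agg_id]],
        "rules": {"first_mapping": first_id, "second_mapping": second_id},
    }


if __name__ == "__main__":
    rows = [
        {"id": 1, "country": "FR", "amount": 100},
        {"id": 2, "country": "FR", "amount": 50},
        {"id": 3, "country": "US", "amount": -20},
    ]
    data_object, aggregation_table = run_etl(rows)
    print(explain(1, rows, data_object, aggregation_table))
```

In this sketch, the feedback step would correspond to replacing an entry of FIRST_MAPPING or SECOND_MAPPING and calling run_etl again on the same input rows. Storing only two short identifiers per generated row, rather than a full description of each rule, is what keeps the log compact in this illustration.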