US 12,248,492 B2
Extract, transform, load monitoring platform
Chanakya Kaspa, McKinney, TX (US); Divya Mehrotra, Frisco, TX (US); and Gregory Muzyn, Plano, TX (US)
Assigned to Capital One Services, LLC, McLean, VA (US)
Filed by Capital One Services, LLC, McLean, VA (US)
Filed on Oct. 26, 2023, as Appl. No. 18/494,917.
Application 18/494,917 is a continuation of application No. 17/303,167, filed on May 21, 2021, granted, now Pat. No. 11,847,130.
Prior Publication US 2024/0054144 A1, Feb. 15, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/25 (2019.01); G06F 16/22 (2019.01); G06F 16/26 (2019.01); G06N 5/04 (2023.01); G06N 20/00 (2019.01)
CPC G06F 16/254 (2019.01) [G06F 16/2282 (2019.01); G06F 16/26 (2019.01); G06N 5/04 (2013.01); G06N 20/00 (2019.01)]
20 Claims
 
1. A system for monitoring an extract, transform, load (ETL) pipeline, comprising:
one or more memories; and
one or more processors, coupled to the one or more memories, configured to:
receive configuration information associated with the ETL pipeline that includes one or more data sources and one or more data sinks;
execute one or more test cases within the ETL pipeline to generate one or more metrics associated with a quality or reliability of data in the ETL pipeline, the one or more test cases each being executed based on extracting respective test case data from a respective source table, transforming the respective test case data in the ETL pipeline, and loading the respective transformed test case data into a respective target table;
generate one or more predicted quality metrics associated with the ETL pipeline using a machine learning model and using the one or more metrics,
wherein the machine learning model is trained using historical execution data associated with one or more ETL jobs; and
generate a visualization that indicates data flow from the one or more data sources to the one or more data sinks and that indicates the one or more predicted quality metrics.