CPC G06F 16/215 (2019.01) [G06F 11/3072 (2013.01); G06F 11/3428 (2013.01); G06F 16/24564 (2019.01); G06F 16/287 (2019.01)] | 18 Claims |
1. A computer-implemented method, comprising:
launching one or more scripts in a batch job on a hardware processor, wherein the one or more scripts describe a plurality of tables that describe a relational data model for a hierarchy of exploration data assets,
wherein the relational data model defines the hierarchy of exploration data assets and attributes at each level of the hierarchy,
wherein the hierarchy of exploration data assets comprises multiple databases holding data records obtained from a geophysical exploration when core samples are extracted from a plurality of wells drilled during the geophysical exploration,
wherein when the batch job is executed, the hardware processor performs operations of:
querying, using the plurality of tables, the hierarchy of exploration data assets according to one or more data quality rules capable of identifying defective data records in the multiple databases of the hierarchy of exploration data assets, wherein the one or more data quality rules specify at least one physical relationship between the core samples whose measurements, as obtained from the geophysical exploration, are captured in at least two data records of a database;
identifying instances of defective data records that fail to meet the one or more data quality rules;
based on the identified instances of defective data records, calculating one or more data quality metrics for the multiple databases of the hierarchy of exploration data assets;
determining the one or more data quality metrics for a first database of the multiple databases;
determining the one or more data quality metrics for a second database of the multiple databases;
based on comparing the one or more data quality metrics for the first database with the one or more data quality metrics for the second database, determining a comparative quality between the first database and the second database; and
generating an alert that includes: the comparative quality between the first database and the second database.
|
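The operations recited in claim 1 (querying against a data quality rule, identifying defective records, calculating a metric per database, comparing two databases, and generating an alert) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the claim: the `core_sample` table and its columns, the rule that two core samples from the same well must not occupy overlapping depth intervals (one possible physical relationship spanning two data records), the defect rate used as the metric, and the dictionary used as the alert. In-memory SQLite databases stand in for the multiple databases.

```python
import sqlite3

# Hypothetical schema: one data record per extracted core sample.
SCHEMA = """
CREATE TABLE core_sample (
    sample_id      INTEGER PRIMARY KEY,
    well_id        TEXT NOT NULL,
    top_depth_m    REAL NOT NULL,
    bottom_depth_m REAL NOT NULL
)
"""

# Assumed data quality rule: two distinct core samples from the same well
# cannot occupy overlapping depth intervals. The self-join compares pairs
# of records and returns every record that takes part in a violation.
OVERLAP_RULE = """
SELECT DISTINCT a.sample_id
FROM core_sample AS a
JOIN core_sample AS b
  ON a.well_id = b.well_id
 AND a.sample_id <> b.sample_id
 AND a.top_depth_m < b.bottom_depth_m
 AND b.top_depth_m < a.bottom_depth_m
"""


def build_database(rows):
    """Create an in-memory database holding the given core sample records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany("INSERT INTO core_sample VALUES (?, ?, ?, ?)", rows)
    return conn


def defect_rate(conn):
    """Fraction of records that fail the overlap rule (illustrative metric)."""
    total = conn.execute("SELECT COUNT(*) FROM core_sample").fetchone()[0]
    defective = len(conn.execute(OVERLAP_RULE).fetchall())
    return defective / total if total else 0.0


def compare_databases(name_a, conn_a, name_b, conn_b):
    """Build an alert carrying the comparative quality of two databases."""
    rate_a, rate_b = defect_rate(conn_a), defect_rate(conn_b)
    if rate_a == rate_b:
        verdict = "equal quality"
    elif rate_a < rate_b:
        verdict = f"{name_a} has higher quality than {name_b}"
    else:
        verdict = f"{name_b} has higher quality than {name_a}"
    return {
        "metrics": {name_a: rate_a, name_b: rate_b},
        "comparative_quality": verdict,
    }


# Made-up records: db_a has contiguous intervals, db_b has one overlapping pair.
db_a = build_database([
    (1, "W1", 100.0, 103.0), (2, "W1", 103.0, 106.0),
    (3, "W2", 50.0, 53.0), (4, "W2", 53.0, 56.0),
])
db_b = build_database([
    (1, "W1", 100.0, 103.0), (2, "W1", 102.0, 105.0),
    (3, "W2", 50.0, 53.0), (4, "W2", 60.0, 63.0),
])
alert = compare_databases("db_a", db_a, "db_b", db_b)
print(alert)
```

In this sketch a lower defect rate is treated as higher quality. A batch job of the kind the claim describes could apply several such rules and aggregate the resulting metrics before comparing databases.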