| CPC G06F 16/215 (2019.01) | 20 Claims |

|
1. A computer-implemented method, comprising:
generating a snapshot of a table-formatted dataset, wherein the snapshot provides a sample comprising a reduced number of rows of the table-formatted dataset such that each column variation of the table-formatted dataset is included in the snapshot;
executing a predetermined collection of data quality (DQ) rules on the snapshot;
determining one or more performance statistics for each of the DQ rules, wherein the performance statistics indicate a likelihood that a DQ rule determines a data quality deficiency;
generating, based on the performance statistics, a subset of the DQ rules, wherein each DQ rule of the subset is selected based on the likelihood that the DQ rule selected detects a quality deficiency; and
generating an order of executing the subset of DQ rules selected, wherein the order generated specifies a sequence for applying each DQ rule of the subset to the table-formatted dataset.
|
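The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation; it only shows one way the sequence (snapshot, rule execution, statistics, subset, order) could fit together. Every name in it (`pattern`, `make_snapshot`, `failure_rates`, `plan_rules`, the sample rows and rules, the `0.1` threshold) is hypothetical. Two assumptions are made explicitly: a "column variation" is taken to mean the shape of a value in a column (null, type, or character pattern), and the performance statistic is taken to be the fraction of snapshot rows a rule flags.

```python
import re
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]
Rule = Callable[[Row], bool]  # returns True when the row passes the check


def pattern(value: Any) -> str:
    """Assumed notion of a 'column variation': the shape of a value."""
    if value is None or value == "":
        return "null"
    if isinstance(value, str):
        # Collapse digits to 9 and letters to A, e.g. "ab12" -> "AA99".
        return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))
    return type(value).__name__


def make_snapshot(rows: List[Row]) -> List[Row]:
    """Keep a row only if it shows a (column, pattern) pair not seen before."""
    seen = set()
    snapshot = []
    for row in rows:
        pairs = {(col, pattern(val)) for col, val in row.items()}
        if not pairs <= seen:
            seen |= pairs
            snapshot.append(row)
    return snapshot


def failure_rates(rules: Dict[str, Rule], snapshot: List[Row]) -> Dict[str, float]:
    """Assumed performance statistic: share of snapshot rows each rule flags."""
    total = len(snapshot) or 1  # avoid dividing by zero on an empty snapshot
    return {
        name: sum(not rule(row) for row in snapshot) / total
        for name, rule in rules.items()
    }


def plan_rules(rates: Dict[str, float], threshold: float = 0.1) -> List[str]:
    """Select rules at or above the threshold, most likely to fire first."""
    selected = [name for name, rate in rates.items() if rate >= threshold]
    return sorted(selected, key=lambda name: rates[name], reverse=True)


# Hypothetical table-formatted dataset.
rows = [
    {"id": 1, "email": "a@x.io", "age": 34},
    {"id": 2, "email": "b@y.io", "age": 41},     # same shapes as row 1
    {"id": 3, "email": None, "age": 29},         # new shape: null email
    {"id": 4, "email": "c@z.io", "age": "n/a"},  # new shape: string age
    {"id": 5, "email": "bad", "age": 52},        # new shape: email without "@"
    {"id": 6, "email": None, "age": "n/a"},      # nothing new
]

# Hypothetical predetermined collection of DQ rules.
rules = {
    "id_present": lambda r: r["id"] is not None,
    "age_is_int": lambda r: isinstance(r["age"], int),
    "email_valid": lambda r: isinstance(r["email"], str) and "@" in r["email"],
}

snapshot = make_snapshot(rows)
rates = failure_rates(rules, snapshot)
plan = plan_rules(rates)
print(plan)  # ['email_valid', 'age_is_int']
```

In this sketch the snapshot keeps four of the six rows, `id_present` never fires and is dropped from the subset, and the remaining rules are ordered so that the one most likely to detect a deficiency runs first against the full dataset. Other statistics (for example, execution cost per rule) could feed the same ordering step.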