CPC G06F 16/26 (2019.01) [G06F 16/2445 (2019.01); G06F 16/2458 (2019.01); G06F 16/24522 (2019.01); G06F 16/24545 (2019.01)] | 20 Claims |
1. A computer-implemented method, comprising:
using a first computer, establishing programmatic connections to a digitally stored first database comprising over one million records, each of the records comprising a plurality of columns, the first database being part of a HADOOP cluster that is programmatically coupled to a HIVE data warehouse manager and a PRESTO query engine;
using the first computer, reading a configuration file that specifies a plurality of tables in the first database;
using the first computer, for each particular table among the plurality of tables, forming and submitting a plurality of PRESTO queries to the first database, each of the PRESTO queries specifying one or more data aggregation operations, and in response thereto, receiving a plurality of result sets of records of the first database;
using the first computer, calculating a plurality of metadata metrics that characterize columns of the records in the result sets and storing the metadata metrics respectively in separate tables for VARCHAR column statistics, NUMERIC column statistics, DATE column statistics, based upon a particular data type among a plurality of different data types of the columns of the records in the result sets; and
using the first computer, generating presentation instructions which when rendered using a computer display device cause displaying one or more graphical visualizations in a graphical user interface of the computer display device.
|
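As an illustration only, the configuration-reading, query-forming, and per-type storage steps of claim 1 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the JSON configuration layout (including the per-table column types), the statistics table names in `STATS_TABLES`, the type mapping in `TYPE_FAMILIES`, and the function names are all hypothetical. The sketch only builds the aggregation queries and decides which statistics table each result would be stored in; it does not connect to a PRESTO engine.

```python
import json

# Hypothetical names for the three per-type statistics tables.
STATS_TABLES = {
    "varchar": "varchar_column_stats",
    "numeric": "numeric_column_stats",
    "date": "date_column_stats",
}

# Assumed mapping from Presto type names to the three families in the claim.
TYPE_FAMILIES = {
    "varchar": "varchar", "char": "varchar",
    "tinyint": "numeric", "smallint": "numeric", "integer": "numeric",
    "bigint": "numeric", "real": "numeric", "double": "numeric",
    "decimal": "numeric",
    "date": "date", "timestamp": "date",
}


def type_family(presto_type):
    """Return 'varchar', 'numeric', 'date', or None for unsupported types."""
    base = presto_type.lower().split("(")[0].strip()
    return TYPE_FAMILIES.get(base)


def aggregate_query(table, column, family):
    """Build one aggregation query for a column, chosen by its type family."""
    col = f'"{column}"'
    common = (
        f"COUNT(*) AS row_count, "
        f"COUNT({col}) AS non_null_count, "
        f"approx_distinct({col}) AS distinct_count"
    )
    if family == "varchar":
        extra = (
            f"MIN(length({col})) AS min_length, "
            f"MAX(length({col})) AS max_length"
        )
    elif family == "numeric":
        extra = (
            f"MIN({col}) AS min_value, "
            f"MAX({col}) AS max_value, "
            f"AVG({col}) AS mean_value"
        )
    else:  # date family
        extra = f"MIN({col}) AS earliest, MAX({col}) AS latest"
    return f"SELECT {common}, {extra} FROM {table}"


def plan_profiling(config_text):
    """Read the configuration and plan one query per supported column.

    Each plan entry records the query text and the statistics table
    that would receive the resulting metrics.
    """
    config = json.loads(config_text)
    plan = []
    for table in config["tables"]:
        for column, presto_type in table["columns"].items():
            family = type_family(presto_type)
            if family is None:
                continue  # skip types outside the three families
            plan.append({
                "table": table["name"],
                "column": column,
                "stats_table": STATS_TABLES[family],
                "query": aggregate_query(table["name"], column, family),
            })
    return plan
```

In a real deployment each planned query would be submitted through a PRESTO client, and the returned aggregates written to the statistics table named in the corresponding plan entry. The column types would more plausibly be read from the warehouse metadata than from the configuration file; they are included in the configuration here only to keep the sketch self-contained.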