CPC G06F 16/2393 (2019.01) [G06F 11/3409 (2013.01); G06F 16/24554 (2019.01)] | 17 Claims |
1. A computer-implemented method of optimizing and persisting database views in a large-scale data processing system, comprising:
receiving queries made to a database through a processor-based search engine, wherein each query generates a respective database view;
tokenizing each query into respective constituent parts, each of which forms nodes and edges in a heterogeneous graph such that common parts of queries are used as a materialized view;
generating a set of database maintained views generated by the queries, wherein each view comprises a dynamically generated virtual table holding data generated by queries applied to one or more tables in the database;
generating, for each query, a graph representation of the query as a query graph that shows commonalities with other queries through the common parts indicated by the materialized view;
obtaining, for each generated view, telemetry information about a respective view including latency, memory space utilization, and processor utilization;
calculating a base score for each generated view using a scoring function based on a minimization objective combining, as a sum, the processor utilization plus a query time for each respective database view plus the memory space utilization;
scoring, by a processor-based component, each view of the generated views based on the base score modified by the obtained information to determine which one or more of the generated views to make persistent;
maintaining, through a computer graphical user interface (GUI), the one or more persistent views to produce a set of optimized persistent database views; and
using a reconstructive self-supervised (SSL) model of graph neural networks (GNN) to generate a database view of other queries using the common parts to produce the set of optimized persistent database views to thereby save resources when processing queries by leveraging the commonalities.
|
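The scoring steps recited in claim 1 can be illustrated with a minimal sketch. The claim specifies only that the base score is a sum of processor utilization, query time, and memory space utilization under a minimization objective; it does not say how the base score is "modified by the obtained information" or how many views are persisted. Everything else below is therefore an assumption made for illustration: the `ViewTelemetry` type, the normalization of each metric to [0, 1], the `memory_budget` penalty used as the modifier, the cutoff `k`, and the view names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ViewTelemetry:
    """Telemetry for one generated view (hypothetical container type)."""
    name: str
    cpu: float      # processor utilization, assumed normalized to [0, 1]
    latency: float  # query time, assumed normalized to [0, 1]
    memory: float   # memory space utilization, assumed normalized to [0, 1]


def base_score(t: ViewTelemetry) -> float:
    """Sum named in the claim: processor utilization + query time + memory."""
    return t.cpu + t.latency + t.memory


def modified_score(t: ViewTelemetry, memory_budget: float = 0.5) -> float:
    """Base score plus a penalty for memory use above a budget.

    The penalty is an assumption; the claim does not define the modifier.
    """
    overage = max(0.0, t.memory - memory_budget)
    return base_score(t) + overage


def select_persistent(views, k):
    """Return the names of the k lowest-scoring views (minimization objective)."""
    ranked = sorted(views, key=modified_score)
    return [t.name for t in ranked[:k]]


# Hypothetical telemetry for three generated views.
views = [
    ViewTelemetry("v_orders", cpu=0.25, latency=0.5, memory=0.25),
    ViewTelemetry("v_users", cpu=0.5, latency=0.25, memory=0.75),
    ViewTelemetry("v_items", cpu=0.125, latency=0.125, memory=0.25),
]
chosen = select_persistent(views, k=2)
print(chosen)
```

In this sketch a lower score marks a better candidate for persistence, which is one reading of the minimization objective; a system that instead persists the views most expensive to regenerate would rank in the opposite direction.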