US 12,229,135 B2
Workload-aware data placement advisor for OLAP database systems
Urvashi Oswal, Fremont, CA (US); Jian Wen, Hollis, NH (US); Farhan Tauheed, Zurich (CH); Onur Kocberber, Thalwil (CH); Seema Sundara, Nashua, NH (US); and Nipun Agarwal, Saratoga, CA (US)
Assigned to Oracle International Corporation, Redwood Shores, CA (US)
Filed by Oracle International Corporation, Redwood Shores, CA (US)
Filed on Mar. 21, 2022, as Appl. No. 17/699,607.
Prior Publication US 2023/0297573 A1, Sep. 21, 2023
Int. Cl. G06F 16/2453 (2019.01); G06F 11/34 (2006.01); G06F 16/21 (2019.01); G06F 16/22 (2019.01); G06F 16/27 (2019.01)
CPC G06F 16/24544 (2019.01) [G06F 11/3409 (2013.01); G06F 16/211 (2019.01); G06F 16/2282 (2019.01); G06F 16/278 (2019.01)]
20 Claims
 
1. A method comprising:
extracting one or more workload-specific features of a database workload running at a database system and one or more dataset-specific features of a database running on the database system;
wherein the one or more workload-specific features of the database workload characterize resource utilization of the database workload;
wherein the one or more dataset-specific features of the database characterize how data is logically organized within the database running on the database system;
identifying a plurality of candidate keys for determining how to partition data stored in the database across two or more computing nodes running the database system;
based, at least in part, on the one or more workload-specific features, the one or more dataset-specific features, and the plurality of candidate keys, generating a set of candidate key combinations for partitioning data in the database over two or more computing nodes;
determining, using a machine-learning model, a particular candidate key combination from the set of candidate key combinations that optimizes query execution performance benefit based on the one or more workload-specific features and the one or more dataset-specific features of the database; and
generating one or more data placement commands to allocate one or more database tables of the database across the two or more computing nodes.
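
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the feature names (join_frequency, distinct_ratio), the scoring function toy_benefit_model (a hand-written stand-in for the trained machine-learning model), and the text of the generated placement commands are all hypothetical and chosen only for illustration.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Column = Tuple[str, str]   # (table name, column name)
KeyCombo = Dict[str, str]  # table name -> chosen partitioning column


class Features:
    """Hypothetical container for the two feature families named in the claim."""

    def __init__(self, join_frequency: Dict[Column, float],
                 distinct_ratio: Dict[Column, float]) -> None:
        # Workload-specific (assumed): fraction of queries that join on the column.
        self.join_frequency = join_frequency
        # Dataset-specific (assumed): distinct values divided by row count.
        self.distinct_ratio = distinct_ratio


def candidate_combinations(candidates: Dict[str, List[str]]) -> List[KeyCombo]:
    """Enumerate every way of picking one candidate key per table."""
    tables = sorted(candidates)
    return [dict(zip(tables, cols))
            for cols in product(*(candidates[t] for t in tables))]


def toy_benefit_model(combo: KeyCombo, features: Features) -> float:
    """Stand-in for the ML model: a higher score means a higher predicted benefit."""
    return sum(features.join_frequency.get((t, c), 0.0)
               * features.distinct_ratio.get((t, c), 0.0)
               for t, c in combo.items())


def choose_combination(
    combos: List[KeyCombo],
    features: Features,
    model: Callable[[KeyCombo, Features], float] = toy_benefit_model,
) -> KeyCombo:
    """Pick the combination with the highest predicted benefit."""
    return max(combos, key=lambda combo: model(combo, features))


def placement_commands(combo: KeyCombo, num_nodes: int) -> List[str]:
    """Emit one command per table; the command syntax is invented for this sketch."""
    return [f"PARTITION TABLE {table} BY HASH({column}) ACROSS {num_nodes} NODES"
            for table, column in sorted(combo.items())]
```

With two tables that each have two candidate keys, candidate_combinations returns four combinations, choose_combination scores each one, and placement_commands turns the winner into one command per table. A real advisor would replace toy_benefit_model with a trained model and would likely prune the combination space rather than enumerate it exhaustively.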