US 12,111,809 B2
System and method for generating a multi dimensional data cube for analytics using a map-reduce program
Midda Dharmika Srinivasulu, Bangalore (IN); Ambuj Saxena, Bangalore (IN); and Amrita Patil, Bangalore (IN)
Assigned to ORACLE INTERNATIONAL CORPORATION, Redwood Shores, CA (US)
Filed by ORACLE INTERNATIONAL CORPORATION, Redwood Shores, CA (US)
Filed on Apr. 4, 2022, as Appl. No. 17/712,946.
Application 17/712,946 is a continuation of application No. 15/611,030, filed on Jun. 1, 2017, granted, now 11,294,876.
Prior Publication US 2022/0229826 A1, Jul. 21, 2022
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/22 (2019.01); G06F 12/0875 (2016.01); G06F 16/13 (2019.01); G06F 16/14 (2019.01); G06F 16/172 (2019.01); G06F 16/182 (2019.01); G06F 16/28 (2019.01)
CPC G06F 16/2264 (2019.01) [G06F 12/0875 (2013.01); G06F 16/13 (2019.01); G06F 16/148 (2019.01); G06F 16/172 (2019.01); G06F 16/182 (2019.01); G06F 16/283 (2019.01); G06F 2212/601 (2013.01)] 20 Claims
 
1. A system for generating a multidimensional data cube, comprising:
a computer comprising one or more microprocessors;
a data processing cluster executing on the one or more microprocessors and operable to:
receive, from one or more data sources, a source data comprising a plurality of columns of data;
combine each of a plurality of categorical columns within the source data with each of a plurality of numerical columns within the source data, to generate a plurality of data column combinations from the source data;
generate a plurality of key-value pairs corresponding to the plurality of data column combinations and row values in the source data;
collect values paired with a same key to determine one or more aggregate numerical values or frequency values within the source data;
generate a plurality of output files, including for each of the plurality of data column combinations generated for the source data, an output file that stores a pre-computed result of a query on the source data represented by the aggregate numerical values or the frequency values;
store the plurality of output files into a data cube, wherein the data cube stores the pre-computed results for the possible queries on the plurality of columns of the source data;
generate a mapping string for each of the plurality of output files in the data cube and indicative of a column of the source data; and
upon receiving another query from a client application, utilizing a generated mapping string to map the received another query to one of the plurality of output files in order to provide, in response to the another query, a pre-computed result stored at the one of the plurality of output files.
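
By way of illustration only, the steps recited in claim 1 can be sketched in a few lines of single-process Python. This is not the claimed implementation: in-memory dictionaries stand in for the data processing cluster and for the output files, and the function names `build_cube` and `answer`, the key layout `(categorical column, numerical column, category value)`, the choice of sum and count as the aggregates, and the `"categorical|numerical"` mapping string format are all assumptions made for the example, none of which the claim specifies.

```python
from collections import defaultdict
from itertools import product


def build_cube(rows, categorical, numerical):
    """Pre-compute one result set per (categorical, numerical) column pair.

    Returns a dict keyed by a mapping string; each value plays the role of
    one output file. All names and formats here are hypothetical.
    """
    # Combine each categorical column with each numerical column.
    combinations = list(product(categorical, numerical))

    # Map phase: emit one key-value pair per row per column combination.
    pairs = []
    for row in rows:
        for cat, num in combinations:
            key = (cat, num, row[cat])
            pairs.append((key, row[num]))

    # Shuffle phase: collect the values that share the same key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)

    # Reduce phase: aggregate numerical values and count frequencies,
    # then store each result under the mapping string of its column pair.
    cube = {}
    for (cat, num, cat_value), values in grouped.items():
        mapping = f"{cat}|{num}"  # assumed mapping string format
        cube.setdefault(mapping, {})[cat_value] = {
            "sum": sum(values),
            "count": len(values),
        }
    return cube


def answer(cube, cat, num):
    """Map an incoming query to its pre-computed 'output file'."""
    return cube[f"{cat}|{num}"]


# Small made-up source data with two categorical and two numerical columns.
rows = [
    {"region": "east", "product": "a", "sales": 10, "units": 1},
    {"region": "west", "product": "a", "sales": 20, "units": 2},
    {"region": "east", "product": "b", "sales": 5, "units": 3},
]
cube = build_cube(rows, ["region", "product"], ["sales", "units"])
result = answer(cube, "region", "sales")
```

In this sketch a query such as "sales by region" never touches the source rows; it is resolved by a single lookup on the mapping string, which is the point of pre-computing one result per column combination.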