CPC G06F 3/0647 (2013.01) [G06F 3/0604 (2013.01); G06F 3/0679 (2013.01)]
18 Claims
1. A method of shuffling data, the method comprising:
shuffling a first batch of data using a first memory on a first level of a memory hierarchy to generate a first batch of shuffled data;
shuffling a second batch of data using the first memory to generate a second batch of shuffled data;
storing the first batch of shuffled data and the second batch of shuffled data in a second memory on a second level of the memory hierarchy;
partitioning a portion of the first batch of data which was streamed from the first memory, wherein partitioning includes generating a partition ID for a record in the first batch of data and the partition ID is used in sorting;
grouping a portion of the second batch of data in parallel with the partitioning of the portion of the first batch of data;
fetching a key column, and streaming a key from the key column; and
partitioning the key in parallel with the fetching.
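
As an informal illustration only, and not a statement of the claimed implementation, the steps of claim 1 might be sketched in software as follows. The sketch assumes records are (key, value) tuples, uses a CRC32 hash modulo a hypothetical partition count to generate partition IDs, models the second memory as a plain Python list, and uses threads and generators as stand-ins for the parallel and streaming behavior recited in the claim. All function names are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor
from itertools import groupby

NUM_PARTITIONS = 4  # hypothetical partition count, not taken from the claim


def partition_id(key, num_partitions=NUM_PARTITIONS):
    """Generate a partition ID for a key (CRC32 is an assumed hash choice)."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def partition_batch(batch):
    """Tag each record streamed out of the first memory with its partition ID."""
    return [(partition_id(key), (key, value)) for key, value in batch]


def group_batch(tagged):
    """Sort tagged records by partition ID, then group records sharing an ID."""
    ordered = sorted(tagged, key=lambda t: t[0])  # the partition ID is the sort key
    return {
        pid: [record for _, record in group]
        for pid, group in groupby(ordered, key=lambda t: t[0])
    }


def stream_keys(key_column):
    """Fetch a key column and stream its keys one at a time."""
    for key in key_column:
        yield key


def partition_keys(key_stream):
    """Partition keys as they arrive; a generator interleaves this with the
    fetch rather than running truly in parallel as hardware could."""
    for key in key_stream:
        yield partition_id(key), key


def shuffle_batches(first_batch, second_batch):
    """Shuffle two batches and store both results in a modeled second memory."""
    second_memory = []  # stands in for the second level of the hierarchy
    second_tagged = partition_batch(second_batch)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Partition the first batch while the second batch is being grouped.
        first_future = pool.submit(partition_batch, first_batch)
        second_future = pool.submit(group_batch, second_tagged)
        first_tagged = first_future.result()
        second_shuffled = second_future.result()
    first_shuffled = group_batch(first_tagged)
    second_memory.append(first_shuffled)
    second_memory.append(second_shuffled)
    return second_memory
```

Calling `shuffle_batches(first_batch, second_batch)` with two lists of (key, value) tuples returns a two-element list, one dictionary per batch, each mapping a partition ID to the records assigned to it. The thread pool merely illustrates the overlap between partitioning one batch and grouping another; a hardware realization could pipeline these stages quite differently.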