CPC G06F 16/273 (2019.01) [A61F 5/566 (2013.01); G06F 9/4881 (2013.01); G06F 9/5016 (2013.01); G06F 9/5044 (2013.01); G06F 9/5083 (2013.01); G06F 9/5088 (2013.01); G06F 16/148 (2019.01); G06F 16/1827 (2019.01); G06F 16/211 (2019.01); G06F 16/221 (2019.01); G06F 16/2365 (2019.01); G06F 16/24532 (2019.01); G06F 16/24545 (2019.01); G06F 16/24552 (2019.01); G06F 16/2456 (2019.01); G06F 16/2471 (2019.01); G06F 16/254 (2019.01); G06F 16/27 (2019.01); G06F 16/283 (2019.01); G06F 16/951 (2019.01); G06F 16/9535 (2019.01); G06F 16/9538 (2019.01); H04L 67/1095 (2013.01); H04L 67/1097 (2013.01); H04L 67/568 (2022.05)] | 20 Claims |
1. A method, comprising:
receiving a relational join query comprising a join operation and an indication of a first relation and a second relation to be joined, wherein the first relation and the second relation are partitioned over processing nodes of a cluster;
re-partitioning the first relation to a plurality of build operator instances;
determining, by a processing device, whether to replicate the first relation to each of a plurality of probe operator instances or to re-partition the second relation between the plurality of probe operator instances, wherein the determining is based at least in part on an actual size of the first relation and an estimated size of the second relation;
based on the determining, distributing the first relation or the second relation over a plurality of communication links of a data communication network to the processing nodes of the cluster associated with the probe operation; and
performing, at the processing nodes of the cluster associated with the probe operation, the relational join query using at least one of a hash join, a sort-merge join, or a nested-loop join to generate a third relation.
|
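For illustration only, the steps recited in claim 1 can be sketched in a single process as follows. This sketch is not taken from the patent. The function names (`choose_strategy`, `partition`, `hash_join`, `distributed_join`), the row-count cost model, and the use of in-memory lists to stand in for cluster nodes and network links are all assumptions made for the example. The claim only requires that the determination be based at least in part on the actual size of the first relation and the estimated size of the second relation. It does not prescribe the comparison used here.

```python
from collections import defaultdict


def choose_strategy(first_actual_rows, second_estimated_rows, num_probe_instances):
    """Pick a distribution strategy (hypothetical cost model: rows sent).

    Replicating sends one full copy of the first relation to every probe
    instance; re-partitioning sends each row of the second relation once.
    """
    replicate_cost = first_actual_rows * num_probe_instances
    repartition_cost = second_estimated_rows
    if replicate_cost <= repartition_cost:
        return "replicate_first"
    return "repartition_second"


def partition(rows, key, n):
    """Hash-partition rows into n lists on the join key."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)
    return parts


def hash_join(build_rows, probe_rows, key):
    """Build a hash table on build_rows, then probe it with probe_rows."""
    table = defaultdict(list)
    for b in build_rows:
        table[b[key]].append(b)
    return [{**b, **p} for p in probe_rows for b in table.get(p[key], [])]


def distributed_join(first, second_partitions, key, second_estimated_rows):
    """Simulate the claimed steps; each list index stands in for one node."""
    n = len(second_partitions)

    # Re-partition the first relation to the build operator instances.
    build_parts = partition(first, key, n)

    # The first relation has now been fully read, so its size is actual,
    # while the size of the second relation is still only an estimate.
    first_actual_rows = sum(len(p) for p in build_parts)
    strategy = choose_strategy(first_actual_rows, second_estimated_rows, n)

    if strategy == "replicate_first":
        # Every probe instance receives the whole first relation and
        # probes it with its local partition of the second relation.
        first_all = [row for part in build_parts for row in part]
        probe_inputs = [(first_all, part) for part in second_partitions]
    else:
        # The second relation is re-partitioned on the join key so that
        # matching rows meet the corresponding build partition.
        second_all = [row for part in second_partitions for row in part]
        probe_inputs = list(zip(build_parts, partition(second_all, key, n)))

    # Perform the join at each probe instance to form the third relation.
    third = []
    for build_rows, probe_rows in probe_inputs:
        third.extend(hash_join(build_rows, probe_rows, key))
    return strategy, third
```

Under this assumed cost model, a small first relation joined against a large second relation is replicated, while a first relation that turns out larger than expected causes the second relation to be re-partitioned instead. Either branch yields the same third relation; only the data moved between nodes differs.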