US 11,853,323 B2
Adaptive distribution method for hash operations
Benoit Dageville, San Carlos, CA (US); Thierry Cruanes, San Mateo, CA (US); Marcin Zukowski, San Mateo, CA (US); Allison Waingold Lee, San Carlos, CA (US); and Philipp Thomas Unterbrunner, Belmont, CA (US)
Assigned to Snowflake Inc., Bozeman, MT (US)
Filed by SNOWFLAKE INC., Bozeman, MT (US)
Filed on Mar. 7, 2023, as Appl. No. 18/118,595.
Application 18/118,595 is a continuation of application No. 17/655,491, filed on Mar. 18, 2022, granted, now 11,620,308.
Application 17/655,491 is a continuation of application No. 17/358,988, filed on Jun. 25, 2021, granted, now 11,294,933, issued on Apr. 5, 2022.
Application 17/358,988 is a continuation of application No. 17/080,219, filed on Oct. 26, 2020, granted, now 11,048,721, issued on Jun. 29, 2021.
Application 17/080,219 is a continuation of application No. 16/858,518, filed on Apr. 24, 2020, granted, now 10,838,979, issued on Nov. 17, 2020.
Application 16/858,518 is a continuation of application No. 16/039,710, filed on Jul. 19, 2018, granted, now 10,997,201, issued on May 4, 2021.
Application 16/039,710 is a continuation of application No. 14/626,836, filed on Feb. 19, 2015, granted, now 10,055,472, issued on Aug. 21, 2018.
Claims priority of provisional application 61/941,986, filed on Feb. 19, 2014.
Prior Publication US 2023/0205783 A1, Jun. 29, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/27 (2019.01); G06F 9/50 (2006.01); G06F 16/14 (2019.01); G06F 16/21 (2019.01); G06F 16/22 (2019.01); G06F 16/951 (2019.01); G06F 16/182 (2019.01); G06F 16/23 (2019.01); G06F 16/2455 (2019.01); G06F 16/2458 (2019.01); G06F 16/9535 (2019.01); G06F 16/2453 (2019.01); H04L 67/568 (2022.01); G06F 16/28 (2019.01); G06F 16/25 (2019.01); A61F 5/56 (2006.01); G06F 16/9538 (2019.01); G06F 9/48 (2006.01); H04L 67/1095 (2022.01); H04L 67/1097 (2022.01)
CPC G06F 16/273 (2019.01) [A61F 5/566 (2013.01); G06F 9/4881 (2013.01); G06F 9/5016 (2013.01); G06F 9/5044 (2013.01); G06F 9/5083 (2013.01); G06F 9/5088 (2013.01); G06F 16/148 (2019.01); G06F 16/1827 (2019.01); G06F 16/211 (2019.01); G06F 16/221 (2019.01); G06F 16/2365 (2019.01); G06F 16/2456 (2019.01); G06F 16/2471 (2019.01); G06F 16/24532 (2019.01); G06F 16/24545 (2019.01); G06F 16/24552 (2019.01); G06F 16/254 (2019.01); G06F 16/27 (2019.01); G06F 16/283 (2019.01); G06F 16/951 (2019.01); G06F 16/9535 (2019.01); G06F 16/9538 (2019.01); H04L 67/1095 (2013.01); H04L 67/1097 (2013.01); H04L 67/568 (2022.05)] 20 Claims
 
1. A method, comprising:
receiving a relational join query comprising a join operation, an indication of a first relation and a second relation to be joined, and a predicate, wherein the first relation and the second relation are partitioned over processing nodes of a cluster;
determining, by a processing device prior to starting distribution of the first or second relation to a plurality of probe operators of a probe operation, whether to distribute the first relation to the probe operation using a broadcast join or to distribute the second relation to the probe operation using a re-partitioning join, wherein the determining is based at least in part on an estimated size of the second relation and a cost metric;
based on the determining, distributing the first relation or the second relation to the processing nodes of the cluster associated with the probe operation; and
performing, at the processing nodes of the cluster associated with the probe operation, the relational join query using at least one of a hash join, a sort-merge join, or a nested-loop join to generate a third relation that contains all combinations of tuples in the first relation and the second relation that satisfy the predicate.
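The steps of claim 1 can be illustrated with a minimal single-process sketch. It is not the patented implementation. All function names, the cost metric (rows moved over the network), and the equality predicate on a single key are assumptions made for illustration; the claim leaves the cost metric and predicate open. The sketch also assumes that the re-partitioning path hash-partitions both relations on the join key so that matching tuples meet on the same node, and it uses a hash join for the local step, one of the three join algorithms the claim lists.

```python
from collections import defaultdict


def choose_distribution(first_size, est_second_size, num_nodes):
    """Pick a strategy before any rows are distributed.

    Hypothetical cost metric: number of rows sent over the network.
    """
    # Broadcast: every node receives a full copy of the first relation.
    broadcast_cost = first_size * num_nodes
    # Re-partition (assumed): each row of both relations moves once.
    repartition_cost = first_size + est_second_size
    return "broadcast" if broadcast_cost <= repartition_cost else "repartition"


def repartition(partitions, key, num_nodes):
    """Route every row to the node selected by a hash of its join key."""
    out = [[] for _ in range(num_nodes)]
    for part in partitions:
        for row in part:
            out[hash(row[key]) % num_nodes].append(row)
    return out


def hash_join(build_rows, probe_rows, key):
    """Local hash join with an equality predicate on `key` (assumed)."""
    table = defaultdict(list)
    for b in build_rows:
        table[b[key]].append(b)
    return [{**b, **p} for p in probe_rows for b in table.get(p[key], [])]


def distributed_join(first_parts, second_parts, key, est_second_size):
    """Simulate the claimed flow; each list element models one node."""
    num_nodes = len(second_parts)
    first_size = sum(len(p) for p in first_parts)
    strategy = choose_distribution(first_size, est_second_size, num_nodes)

    if strategy == "broadcast":
        # Copy the whole first relation to every node; the second
        # relation stays where it is already partitioned.
        full_first = [row for part in first_parts for row in part]
        build_side = [full_first] * num_nodes
        probe_side = second_parts
    else:
        build_side = repartition(first_parts, key, num_nodes)
        probe_side = repartition(second_parts, key, num_nodes)

    # Each node joins its local build and probe rows; the union of the
    # per-node outputs is the third relation.
    result = []
    for build, probe in zip(build_side, probe_side):
        result.extend(hash_join(build, probe, key))
    return strategy, result


first_parts = [[{"k": 1, "a": "x"}], [{"k": 2, "a": "y"}]]
second_parts = [
    [{"k": 1, "b": 10}, {"k": 2, "b": 20}],
    [{"k": 1, "b": 30}, {"k": 3, "b": 40}],
]
strategy, rows = distributed_join(first_parts, second_parts, "k", est_second_size=4)
```

Under this assumed cost metric, a larger estimate for the second relation favors broadcasting the first relation, and a smaller estimate favors re-partitioning. Both paths produce the same set of joined tuples; only the amount of data moved differs.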