US 12,314,258 B2
Runtime join pruning to improve join performance for tables
Dimitrios Tsirogiannis, Belmont, CA (US); and Zhaohui Zhang, Redwood City, CA (US)
Assigned to Snowflake Inc., Bozeman, MT (US)
Filed by Snowflake Inc., Bozeman, MT (US)
Filed on Apr. 29, 2024, as Appl. No. 18/649,509.
Application 18/649,509 is a continuation of application No. 18/358,402, filed on Jul. 25, 2023, granted, now 11,995,080.
Prior Publication US 2025/0036620 A1, Jan. 30, 2025
Int. Cl. G06F 16/2453 (2019.01)
CPC G06F 16/24537 (2019.01) [G06F 16/24542 (2019.01); G06F 16/24549 (2019.01)]
20 Claims
 
1. A network-based database system comprising:
at least one hardware processor; and
a memory storing instructions that cause the at least one hardware processor to perform operations comprising:
generating, during a compaction process of a particular table, a set of expression properties for a set of columns of the particular table, the set of expression properties including metadata related to the set of columns of the particular table;
performing, after generating the set of expression properties, an insert operation of a set of values into the particular table;
generating a query plan based on a query, the query plan comprising a first set of nodes and a second set of nodes, the first set of nodes comprising a build side of a hash join, and the second set of nodes comprising a probe side of the hash join, the second set of nodes including a first node corresponding to a key value table scan operation, and a second node corresponding to a bloom filter operation;
executing, by an execution node using the generated query plan, the query, the executing comprising:
performing, during execution of the query by the execution node, a runtime range pruning process, the runtime range pruning process comprising:
determining a set of range sets for pruning, each range set including a set of columns from a first table, the set of columns being removed from undergoing a read operation as part of executing the query;
determining, based on a range bloom vector, a set of rows in a particular range set of the first table to avoid scanning in connection with the read operation; and
performing the read operation based on a remaining set of rows from the particular range set of the first table.
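
The runtime range pruning steps recited above can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not define the layout of the range bloom vector, so the bucketed bit vector below, the names `RangeSet`, `build_range_bloom_vector`, `range_may_match`, and `hash_join_with_pruning`, and the constants `BUCKET_WIDTH` and `NUM_BITS` are all assumptions made for illustration. The sketch builds a hash table and a bit vector from the build side, skips probe-side range sets whose key range cannot match, skips individual rows whose bucket bit is unset, and reads only the remaining rows.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical parameters; the claim does not specify a vector layout.
BUCKET_WIDTH = 10  # each bucket covers 10 consecutive integer key values
NUM_BITS = 64      # number of bits in the vector


@dataclass
class RangeSet:
    """A slice of the probe-side table with min/max metadata on the join key."""
    min_key: int
    max_key: int
    rows: List[Tuple[int, str]]  # (join key, payload)


def _bit(key: int) -> int:
    """Map a key to the bit position of its bucket."""
    return (key // BUCKET_WIDTH) % NUM_BITS


def build_range_bloom_vector(build_keys: Iterable[int]) -> int:
    """Set one bit per bucket that contains at least one build-side key."""
    vector = 0
    for key in build_keys:
        vector |= 1 << _bit(key)
    return vector


def range_may_match(vector: int, lo: int, hi: int) -> bool:
    """Return False only if no build-side key can fall inside [lo, hi]."""
    first, last = lo // BUCKET_WIDTH, hi // BUCKET_WIDTH
    if last - first + 1 >= NUM_BITS:
        # The range wraps over every bit, so any set bit may match.
        return vector != 0
    return any((vector >> (b % NUM_BITS)) & 1 for b in range(first, last + 1))


def hash_join_with_pruning(
    build_rows: List[Tuple[int, str]],
    probe_range_sets: List[RangeSet],
) -> Tuple[List[Tuple[int, str, str]], int]:
    """Join build and probe sides; return (results, number of rows read)."""
    # Build side: hash table plus the bit vector summarizing its keys.
    hash_table: Dict[int, List[str]] = {}
    for key, payload in build_rows:
        hash_table.setdefault(key, []).append(payload)
    vector = build_range_bloom_vector(hash_table.keys())

    results: List[Tuple[int, str, str]] = []
    rows_read = 0
    for range_set in probe_range_sets:
        # Range-level pruning: skip the whole range set using its metadata.
        if not range_may_match(vector, range_set.min_key, range_set.max_key):
            continue
        for key, payload in range_set.rows:
            # Row-level pruning: skip rows whose bucket bit is not set.
            if not (vector >> _bit(key)) & 1:
                continue
            rows_read += 1
            for build_payload in hash_table.get(key, []):
                results.append((key, build_payload, payload))
    return results, rows_read
```

In this sketch, a build side holding keys 5 and 42 sets only two bits, so a probe-side range set covering keys 100 to 199 is skipped without being read. As with any bloom-style summary, a set bit can be a false positive (for example, key 3 shares a bucket with key 5), so surviving rows still go through the hash table lookup.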