US 12,346,329 B2
Range partitioned in-memory joins
Michael Warren Watzke, Fitchburg, WI (US); and Bhashyam Ramesh, Secunderabad (IN)
Assigned to Teradata US, Inc., San Diego, CA (US)
Filed by TERADATA US, INC., San Diego, CA (US)
Filed on Dec. 21, 2020, as Appl. No. 17/128,764.
Prior Publication US 2022/0197902 A1, Jun. 23, 2022
Int. Cl. G06F 16/2455 (2019.01); G06F 16/22 (2019.01); G06F 16/2457 (2019.01)
CPC G06F 16/2456 (2019.01) [G06F 16/2255 (2019.01); G06F 16/2282 (2019.01); G06F 16/24573 (2019.01)]
18 Claims
1. A database system comprising:
a memory to store object metadata of a first table, the object metadata comprising object ranges of values of a first attribute in respective plural objects of a remote data store coupled to the database system over a network;
a plurality of processing engines comprising processors to access data of the plural objects over the network from the remote data store;
at least one processor to:
receive a join query comprising at least one join attribute to join a plurality of tables including the first table, the at least one join attribute comprising the first attribute;
based on the join query, sort the object ranges of the values of the first attribute in the object metadata based on midpoint values of the object ranges, to produce a sorted object metadata, wherein the sorting based on the midpoint values of the object ranges comprises ordering, in the sorted object metadata, a first object range having a first midpoint value before a second object range having a second midpoint value that is greater than the first midpoint value;
store the sorted object metadata in the memory; and
assign objects of the plural objects to the plurality of processing engines based on the sorted object metadata of the first table, wherein the plural objects contain tuples of the first table and are range partitioned across the plurality of processing engines based on the object ranges of the values of the first attribute that is part of the at least one join attribute in the join query, and
wherein a first processing engine of the plurality of processing engines is to:
retrieve tuples of the first table from objects of a subset of the plural objects, and add content of the retrieved tuples to an in-memory table;
retrieve, from the remote data store, tuples of a second table of the plurality of tables based on a range of values of the first attribute in the retrieved tuples of the first table; and
perform an in-memory join of the plurality of tables based on the retrieved tuples of the second table and the in-memory table.
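
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (ObjectRange, sort_by_midpoint, assign_to_engines, engine_join, fetch_second) is hypothetical, and the contiguous-slice assignment policy and the dictionary-based in-memory table are assumptions chosen for brevity. The sketch sorts object ranges by midpoint, range partitions the objects across processing engines, and has one engine join its in-memory tuples against second-table tuples fetched for the matching range of values.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class ObjectRange:
    """Metadata for one remote object: min and max of the join attribute."""
    object_id: str
    low: int
    high: int

    @property
    def midpoint(self) -> float:
        return (self.low + self.high) / 2


def sort_by_midpoint(ranges: List[ObjectRange]) -> List[ObjectRange]:
    """Order ranges so that a smaller midpoint comes before a larger one."""
    return sorted(ranges, key=lambda r: r.midpoint)


def assign_to_engines(sorted_ranges: List[ObjectRange],
                      num_engines: int) -> List[List[ObjectRange]]:
    """Assumed policy: each engine gets a contiguous slice of the sorted
    metadata, so neighbouring value ranges land on the same engine.
    Requires num_engines >= 1."""
    assignment: List[List[ObjectRange]] = [[] for _ in range(num_engines)]
    per_engine = -(-len(sorted_ranges) // num_engines)  # ceiling division
    for i, r in enumerate(sorted_ranges):
        assignment[i // per_engine].append(r)
    return assignment


Row = Tuple[int, str]  # (join attribute value, payload)


def engine_join(first_rows: List[Row],
                fetch_second: Callable[[int, int], List[Row]]
                ) -> List[Tuple[int, str, str]]:
    """One engine's work: build an in-memory table from its first-table
    tuples, fetch second-table tuples for the covered value range, and
    join the two in memory."""
    in_memory: Dict[int, List[str]] = {}
    for key, payload in first_rows:
        in_memory.setdefault(key, []).append(payload)
    if not in_memory:
        return []
    lo, hi = min(in_memory), max(in_memory)
    joined = []
    for key, payload2 in fetch_second(lo, hi):
        for payload1 in in_memory.get(key, []):
            joined.append((key, payload1, payload2))
    return joined


# Hypothetical object metadata for the first table.
metadata = [
    ObjectRange("obj_c", 40, 60),
    ObjectRange("obj_a", 0, 20),
    ObjectRange("obj_d", 55, 90),
    ObjectRange("obj_b", 10, 35),
]
sorted_meta = sort_by_midpoint(metadata)
engines = assign_to_engines(sorted_meta, num_engines=2)

# Hypothetical second table, standing in for the remote data store.
second_table = [(3, "s"), (5, "p"), (12, "q"), (50, "r")]


def fetch_second(lo: int, hi: int) -> List[Row]:
    return [(k, v) for k, v in second_table if lo <= k <= hi]


result = engine_join([(5, "x"), (12, "y")], fetch_second)
```

With this sample data the midpoints are 10, 22.5, 50 and 72.5, so the first engine receives obj_a and obj_b and the second receives obj_c and obj_d. Overlapping ranges, such as obj_c and obj_d, still end up adjacent in the sorted metadata, which is the practical effect of ordering by midpoint rather than by lower bound alone.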