US 12,222,964 B2
Database processing using hybrid key-value tables
Joshua Slocum, Austin, TX (US); and Evan J. Tschannen, Hillsborough, CA (US)
Assigned to Snowflake Inc., Bozeman, MT (US)
Filed by Snowflake Inc., Bozeman, MT (US)
Filed on Apr. 28, 2022, as Appl. No. 17/661,162.
Prior Publication US 2023/0350921 A1, Nov. 2, 2023
Int. Cl. G06F 16/28 (2019.01); G06F 16/22 (2019.01); G06F 16/2458 (2019.01); G06F 16/27 (2019.01)
CPC G06F 16/283 (2019.01) [G06F 16/2282 (2019.01); G06F 16/2477 (2019.01); G06F 16/27 (2019.01); G06F 16/289 (2019.01)]
24 Claims
 
1. A method for processing data on a distributed database:
storing, by one or more hardware processors, key-value data in a transactional database of the distributed database, the key-value data being stored in a key-value pair format in the transactional database;
querying the transactional database to determine a range size of the key-value data;
splitting, on the transactional database, the key-value data into range granules that cover data ranges, the range granules being indexed by range based on the range size, at least a portion of the range granules being of equal size;
replicating the range granules to a range-based object storage database of the distributed database, the replicating of the range granules including storing the range granules in an object format in the range-based object storage database;
receiving a read request on the transactional database;
determining that the read request is a read of the key-value data that exceeds a size limit;
in response to determining that the read request exceeds the size limit for the key-value data, identifying the range granules in the range-based object storage database that correspond to data requested in the read request, and transmitting a list of object data of the identified range granules to a plurality of execution nodes to process the read request;
receiving, from the range-based object storage database, the range granules that correspond to the list of object data using the plurality of execution nodes; and
generating, by the plurality of execution nodes, results data according to the read request using the range granules received from the range-based object storage database.
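
The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Everything named in it is a hypothetical stand-in: the Granule class, the function names, the in-memory dict used as the "transactional database", the dict used as the "object storage database", and the sequential loop that plays the part of the execution nodes. The claim does not say how range size or the read size limit is measured, so the sketch assumes both are counted in rows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Granule:
    """Hypothetical range granule: a sorted run of key-value pairs."""
    start: str   # first key covered (inclusive)
    end: str     # last key covered (inclusive)
    rows: tuple  # ((key, value), ...) sorted by key


def split_into_granules(kv, granule_size):
    """Split key-value data into granules of at most granule_size rows.

    All granules except possibly the last have equal size.
    """
    items = sorted(kv.items())
    granules = []
    for i in range(0, len(items), granule_size):
        chunk = tuple(items[i:i + granule_size])
        granules.append(Granule(chunk[0][0], chunk[-1][0], chunk))
    return granules


def replicate(granules):
    """Copy granules into a stand-in object store, keyed by object name."""
    return {f"granule-{i:04d}": g for i, g in enumerate(granules)}


def plan_read(kv, object_store, lo, hi, size_limit):
    """Route a range read over keys lo..hi (inclusive).

    Small reads are answered from the key-value store directly. Reads that
    exceed size_limit rows (an assumed unit) return the list of object names
    whose granules overlap the requested range.
    """
    matched = [k for k in sorted(kv) if lo <= k <= hi]
    if len(matched) <= size_limit:
        return "kv", [(k, kv[k]) for k in matched]
    names = [name for name, g in sorted(object_store.items())
             if g.start <= hi and g.end >= lo]
    return "objects", names


def execute(object_store, names, lo, hi, n_nodes=2):
    """Simulate n_nodes execution nodes, each fetching a share of the objects.

    Object names are dealt out round-robin; each simulated node filters the
    granules it fetched down to the requested range.
    """
    results = []
    for node in range(n_nodes):
        for name in names[node::n_nodes]:
            rows = object_store[name].rows
            results.extend(r for r in rows if lo <= r[0] <= hi)
    return sorted(results)


# Usage: ten keys, granules of four rows, reads limited to three rows.
kv = {f"k{i:03d}": i for i in range(10)}
store = replicate(split_into_granules(kv, granule_size=4))
kind, names = plan_read(kv, store, "k002", "k008", size_limit=3)
rows = execute(store, names, "k002", "k008") if kind == "objects" else names
```

In this sketch the object store is consulted only when a read is too large for the key-value path, which mirrors the conditional wording of the claim ("in response to determining that the read request exceeds the size limit").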