US 11,989,171 B2
Data storage method and system
Jeremy Kong, London (GB); Grgur Petric Maretic, London (GB); Gokcan Ozakdag, London (GB); James Baker, London (GB); Sandor Van Wassenhove, London (GB); and Thomas Petracca, New York, NY (US)
Assigned to Palantir Technologies Inc., Palo Alto, CA (US)
Filed by Palantir Technologies Inc., Palo Alto, CA (US)
Filed on Nov. 8, 2021, as Appl. No. 17/521,481.
Application 17/521,481 is a continuation of application No. 16/402,700, filed on May 3, 2019, granted, now 11,169,987, issued on Nov. 9, 2021.
Claims priority of application No. 1816808 (GB), filed on Oct. 16, 2018.
Prior Publication US 2022/0207025 A1, Jun. 30, 2022
Int. Cl. G06F 7/00 (2006.01); G06F 16/22 (2019.01); G06F 16/23 (2019.01); G06F 16/27 (2019.01)
CPC G06F 16/2379 (2019.01) [G06F 16/221 (2019.01); G06F 16/278 (2019.01)] 15 Claims
 
1. A method, performed by one or more processors, the method comprising:
receiving timestamp data representing transaction start times of each of a plurality of database transactions;
dividing the timestamp data into a plurality of partitioning quanta (nPQ), each partitioning quantum (PQ) of the plurality of partitioning quanta representing a range of sequential start timestamps; and
for each partitioning quantum (PQ) of the plurality of partitioning quanta (nPQ), allocating the timestamp data, such that sequentially adjacent start timestamps are allocated to different partitions of one or more physical storage systems for subsequent storage at the allocated partitions, wherein the different partitions are physically distinct partitions on a hard drive or solid state memory or physically different memory devices;
storing values representing the timestamp data, or data associated with the timestamp data, in the allocated partitions of the one or more physical storage systems, wherein the allocating comprises generating a data structure for each partitioning quantum (PQ) comprising N rows and M columns, each of the N rows corresponding to a respective one of the partitions (NP) and each of the M columns corresponding to a subset range of timestamps within each partitioning quantum (PQ), each of the N rows and M columns having respective row and column keys to enable access to the timestamp data;
receiving a range scan query corresponding to a range of the timestamp data;
locating relevant PQs from the plurality of partitioning quanta (nPQ) for the range of the timestamp data;
for each PQ of the relevant PQs, loading all relevant columns across all NP rows by determining column keys for the one or more PQs, wherein the column keys correspond to the range of the timestamp data; and
obtaining the timestamp data, or the data derived from the timestamp data, from the column keys,
wherein the row or column keys are encoded using variable-length encoding, and
wherein the values allocated to the respective partitions of the one or more physical storage media represent database transaction commit times (Tc) corresponding to the same transaction as the database transaction start times (Ts).
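
The allocation and range-scan steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The constants `PQ_SIZE`, `NUM_PARTITIONS` and `COL_WIDTH`, the helper names `locate` and `TimestampStore`, the modulo rule used to place adjacent start timestamps on different partitions, and the choice of unsigned LEB128 as the variable-length key encoding are all assumptions made for illustration; the claim itself does not specify them.

```python
from collections import defaultdict

# Hypothetical parameters, chosen only to keep the example small.
PQ_SIZE = 1000        # start timestamps covered by one partitioning quantum (PQ)
NUM_PARTITIONS = 4    # NP: number of partitions, i.e. rows per PQ
COL_WIDTH = 100       # timestamps per column, giving M = 10 columns per PQ


def encode_varint(n: int) -> bytes:
    """Unsigned LEB128: 7 payload bits per byte, high bit marks continuation.

    Assumed stand-in for the claim's "variable-length encoding" of keys.
    """
    if n < 0:
        raise ValueError("varint input must be non-negative")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def locate(start_ts: int) -> tuple[int, int, int]:
    """Map a start timestamp to (PQ index, row/partition, column)."""
    pq = start_ts // PQ_SIZE
    row = start_ts % NUM_PARTITIONS          # adjacent timestamps -> different rows
    col = (start_ts % PQ_SIZE) // COL_WIDTH  # subset range within the PQ
    return pq, row, col


class TimestampStore:
    """In-memory stand-in for the physically distinct partitions."""

    def __init__(self) -> None:
        # partitions[row] maps (pq, encoded column key) -> {Ts: Tc}
        self.partitions = [defaultdict(dict) for _ in range(NUM_PARTITIONS)]

    def put(self, start_ts: int, commit_ts: int) -> None:
        """Store commit time Tc under the cell allocated to start time Ts."""
        pq, row, col = locate(start_ts)
        self.partitions[row][(pq, encode_varint(col))][start_ts] = commit_ts

    def range_scan(self, lo: int, hi: int) -> dict[int, int]:
        """Return {Ts: Tc} for every stored Ts with lo <= Ts <= hi."""
        result: dict[int, int] = {}
        # Locate the relevant PQs for the queried range.
        for pq in range(lo // PQ_SIZE, hi // PQ_SIZE + 1):
            pq_start = pq * PQ_SIZE
            first = max(lo, pq_start) - pq_start
            last = min(hi, pq_start + PQ_SIZE - 1) - pq_start
            # Determine the column keys covering the range, then load
            # those columns across all NP rows.
            for col in range(first // COL_WIDTH, last // COL_WIDTH + 1):
                key = (pq, encode_varint(col))
                for partition in self.partitions:
                    for ts, tc in partition.get(key, {}).items():
                        if lo <= ts <= hi:
                            result[ts] = tc
        return result


store = TimestampStore()
for ts in range(990, 1010):
    store.put(ts, ts + 5)             # pretend each transaction commits 5 ticks later
scan = store.range_scan(995, 1003)    # spans two PQs (0 and 1)
```

In this sketch a write for start timestamp 1000 lands on row 0 and a write for 1001 lands on row 1, so consecutive writes are spread over partitions, while a range scan touches only the columns whose subset ranges overlap the query and reads them from every row.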