US 11,941,028 B2
Efficient process for creating range-partitioned indexes ensuring uniform document distribution
Nawab Zada Asad Iqbal, Union City, CA (US)
Assigned to Box, Inc., Redwood City, CA (US)
Filed by Box, Inc., Redwood City, CA (US)
Filed on Jan. 10, 2019, as Appl. No. 16/244,289.
Prior Publication US 2020/0226149 A1, Jul. 16, 2020
Int. Cl. G06F 16/27 (2019.01); G06F 16/22 (2019.01)
CPC G06F 16/278 (2019.01) [G06F 16/2272 (2019.01); G06F 16/2282 (2019.01)] 15 Claims
 
1. A method for distributing records among storage partitions based on range-partitioned indexes, the method comprising:
maintaining, by a storage system, a table of a plurality of records indexed based on a plurality of original partitioning keys in the table of records, wherein each original partitioning key comprises an identifier uniquely identifying a record of the plurality of records and an owner of the record, wherein the storage system comprises a multi-tenant storage system, wherein the records comprise records associated with a plurality of tenants, and wherein the owner of each record comprises one of a plurality of users associated with one of the plurality of tenants;
initializing, by a partitioning function executed by the storage system, a plurality of counters, each counter of the plurality of counters associated with a sub-range of a plurality of equal-length sub-ranges in a total range of key values for a secondary index partitioning key, wherein the secondary index partition key comprises a foreign key uniquely identifying an association between owners of two or more records of the plurality of records, wherein the association between records uniquely identified by the secondary index partition key comprises an identification of the association between the owner of each record and one of the plurality of tenants, wherein a sub-range for each record is identified by a pre-determined number of first bits of the secondary index partition key and the number of sub-ranges is based on a range of values represented by the pre-determined number of first bits of the secondary index partition key;
reading, by the partitioning function executed by the storage system, each record of the table of records;
accumulating, by the partitioning function executed by the storage system, in each counter of the plurality of counters, a count of a number of records in the associated sub-range based on the pre-determined number of first bits of the secondary index partition key for each of the records of the table of records;
determining, by the partitioning function executed by the storage system, a number of records per partition based on a total number of records in the total range of key values for the secondary index partitioning key and a number of available partitions in the storage system; and
distributing, by the partitioning function executed by the storage system, the records stored by the storage system to the available partitions in the storage system based on the number of records in each sub-range while keeping records having a same secondary index partitioning key in a least number of partitions.