US 12,332,875 B1
Nested array batch processing
Shoumik Palkar, Mountain View, CA (US); Alexander Behm, Lafayette, CA (US); and David Cashman, Toronto (CA)
Assigned to Databricks, Inc., San Francisco, CA (US)
Filed by Databricks, Inc., San Francisco, CA (US)
Filed on Aug. 9, 2022, as Appl. No. 17/884,099.
Claims priority of provisional application 63/388,105, filed on Jul. 11, 2022.
Int. Cl. G06F 16/23 (2019.01); G06F 16/2453 (2019.01); G06F 16/2455 (2019.01)
CPC G06F 16/2386 (2019.01) [G06F 16/24542 (2019.01); G06F 16/24557 (2019.01)]
18 Claims
 
1. A system, comprising:
a memory;
one or more processors configured to:
obtain a query plan for processing input data in response to a query;
obtain the input data;
select a batch of the input data, wherein the batch of the input data comprises a column of a plurality of rows, wherein the column corresponds to an array type such that one or more rows of the plurality of rows comprises an array in a column position for the row;
create a metadata structure for the batch, wherein the metadata structure includes a mapping of child data corresponding to a particular row being stored in a contiguous memory subset of one or more contiguous parts of memory;
allocate the one or more contiguous parts of the memory for processing the batch;
process the batch of the input data in accordance with the metadata structure to generate resulting data; and
store each array of the resulting data for the batch in one of the one or more contiguous parts of the memory.
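The arrangement recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim does not specify the format of the metadata structure, so the sketch assumes a per-row offsets and lengths layout over a single contiguous child buffer. The names `ArrayBatch`, `build_batch`, `map_batch`, and `filter_batch` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class ArrayBatch:
    """Hypothetical layout for one array-typed column of a batch."""
    offsets: List[int]  # assumed metadata: where each row's elements start in `child`
    lengths: List[int]  # assumed metadata: how many elements each row holds
    child: List[int]    # one contiguous buffer holding the elements of every row

    def row(self, i: int) -> List[int]:
        """Return the array stored in the column position of row i."""
        start = self.offsets[i]
        return self.child[start:start + self.lengths[i]]


def build_batch(rows: Sequence[Sequence[int]]) -> ArrayBatch:
    """Pack per-row arrays into one contiguous buffer and record the mapping."""
    offsets: List[int] = []
    lengths: List[int] = []
    child: List[int] = []
    for r in rows:
        offsets.append(len(child))
        lengths.append(len(r))
        child.extend(r)
    return ArrayBatch(offsets, lengths, child)


def map_batch(batch: ArrayBatch, fn: Callable[[int], int]) -> ArrayBatch:
    """Element-wise operation: one pass over the child buffer, row mapping unchanged."""
    return ArrayBatch(list(batch.offsets), list(batch.lengths),
                      [fn(x) for x in batch.child])


def filter_batch(batch: ArrayBatch, pred: Callable[[int], bool]) -> ArrayBatch:
    """Per-row filter: result arrays are written back-to-back into a new buffer."""
    offsets: List[int] = []
    lengths: List[int] = []
    child: List[int] = []
    for i in range(len(batch.offsets)):
        kept = [x for x in batch.row(i) if pred(x)]
        offsets.append(len(child))
        lengths.append(len(kept))
        child.extend(kept)
    return ArrayBatch(offsets, lengths, child)


batch = build_batch([[1, 2, 3], [], [4, 5]])
doubled = map_batch(batch, lambda x: x * 2)
evens = filter_batch(batch, lambda x: x % 2 == 0)
```

In this sketch, `batch.offsets` is `[0, 3, 3]` and `batch.child` is `[1, 2, 3, 4, 5]`, so each row's array occupies a contiguous slice of the shared buffer. After filtering, `evens.child` is `[2, 4]` with offsets `[0, 1, 1]`, so each resulting array is again stored contiguously.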