US 12,271,381 B2
Query execution via communication with an object storage system via an object storage communication protocol
S. Christopher Gladwin, Chicago, IL (US); George Kondiles, Chicago, IL (US); Jason Arnold, Chicago, IL (US); Greg R. Dhuse, Chicago, IL (US); and Joseph Jablonski, Chicago, IL (US)
Assigned to Ocient Holdings LLC, Chicago, IL (US)
Filed by Ocient Holdings LLC, Chicago, IL (US)
Filed on Jan. 3, 2024, as Appl. No. 18/403,002.
Claims priority of provisional application 63/482,497, filed on Jan. 31, 2023.
Claims priority of provisional application 63/482,485, filed on Jan. 31, 2023.
Claims priority of provisional application 63/482,504, filed on Jan. 31, 2023.
Prior Publication US 2024/0256541 A1, Aug. 1, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/245 (2019.01); G06F 16/22 (2019.01); G06F 16/2453 (2019.01); G06F 16/25 (2019.01)
CPC G06F 16/24544 (2019.01) [G06F 16/22 (2019.01); G06F 16/24532 (2019.01); G06F 16/24537 (2019.01); G06F 16/24542 (2019.01); G06F 16/258 (2019.01)]
20 Claims
 
1. A method for execution by a data processing system comprising:
determining a query for execution;
generating a query operator execution flow for the query that includes a first at least one operator serially before a second at least one operator; and
executing the query to generate a query resultant based on:
executing the first at least one operator of the query operator execution flow based on:
generating, based on the query, a request for rows in accordance with an object storage communication protocol, wherein the request for rows indicates filtering parameter data;
sending the request indicating the filtering parameter data to an object storage system in accordance with the object storage communication protocol, wherein the object storage system stores a plurality of records via a plurality of objects in memory resources of the object storage system and further stores configuration data mapping storage of the plurality of records of a plurality of datasets via the plurality of objects; wherein the object storage system processes the request to generate a filtered row set via processing the request for rows based on:
executing a record identification pipeline for execution based on applying the filtering parameter data and the configuration data, wherein a proper subset of the plurality of records meeting the filtering parameter data is identified based on executing the record identification pipeline by accessing at least one object of the plurality of objects; and
receiving a response from the object storage system in accordance with the object storage communication protocol indicating the filtered row set generated by the object storage system as the proper subset of the plurality of records stored by the object storage system that compare favorably to the filtering parameter data based on the object storage system processing the request; and
executing the second at least one operator of the query operator execution flow via a plurality of parallelized nodes of a query execution plan based on:
processing the filtered row set indicated in the response in accordance with the second at least one operator to produce the query resultant based on multiple nodes of the plurality of parallelized nodes each processing corresponding subsets of the filtered row set in parallel with other ones of the multiple nodes.
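The two-stage flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory ToyObjectStore, the RowRequest shape, the single min_amount filter, and the summing second operator are all hypothetical stand-ins chosen for illustration, and a thread pool stands in for the plurality of parallelized nodes. The first operator sends a request carrying filtering parameter data to the store, which uses its configuration data to locate objects and return only matching records; the second operator then processes subsets of that filtered row set in parallel.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class RowRequest:
    """Hypothetical request for rows carrying filtering parameter data."""
    dataset: str
    min_amount: int  # assumed filter: keep records with amount >= min_amount


class ToyObjectStore:
    """In-memory stand-in for an object storage system (illustrative only)."""

    def __init__(self, objects, config):
        self.objects = objects  # object key -> list of records
        self.config = config    # dataset name -> object keys holding its records

    def handle(self, request):
        # Record identification: the configuration data locates the objects,
        # then the filtering parameter data selects the matching records.
        rows = []
        for key in self.config[request.dataset]:
            for record in self.objects[key]:
                if record["amount"] >= request.min_amount:
                    rows.append(record)
        return rows


def first_operator(store, request):
    """First operator: send the request and receive the filtered row set."""
    return store.handle(request)


def second_operator(rows, num_nodes=2):
    """Second operator: each 'node' sums its own subset of the filtered rows."""
    parts = [rows[i::num_nodes] for i in range(num_nodes)]
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(lambda part: sum(r["amount"] for r in part), parts))
    return sum(partials)


def run_query(store, dataset, min_amount, num_nodes=2):
    """Run the first operator serially before the second operator."""
    request = RowRequest(dataset, min_amount)
    filtered = first_operator(store, request)
    return second_operator(filtered, num_nodes)


store = ToyObjectStore(
    objects={
        "obj-1": [{"id": 1, "amount": 5}, {"id": 2, "amount": 50}],
        "obj-2": [{"id": 3, "amount": 70}, {"id": 4, "amount": 10}],
    },
    config={"sales": ["obj-1", "obj-2"]},
)
total = run_query(store, "sales", min_amount=20)
```

In this toy run the store returns two of the four stored records (a proper subset), so only the filtered rows, rather than every stored record, reach the parallel stage.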