US 12,189,625 B2
Multi-cluster query result caching
Bogdan Ionut Ghit, Amsterdam (NL); Saksham Garg, Amsterdam (NL); Christian Stuart, Amsterdam (NL); and Christopher Stevens, St. Petersburg, FL (US)
Assigned to Databricks, Inc., San Francisco, CA (US)
Filed by Databricks, Inc., San Francisco, CA (US)
Filed on Jul. 14, 2023, as Appl. No. 18/222,343.
Application 18/222,343 is a continuation of application No. 18/221,735, filed on Jul. 13, 2023.
Claims priority of provisional application 63/483,458, filed on Feb. 6, 2023.
Prior Publication US 2024/0265011 A1, Aug. 8, 2024
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/24 (2019.01); G06F 16/2453 (2019.01); G06F 16/25 (2019.01); G06F 16/28 (2019.01)
CPC G06F 16/24539 (2019.01) [G06F 16/24542 (2019.01); G06F 16/256 (2019.01); G06F 16/285 (2019.01)]
20 Claims
 
1. A method comprising:
receiving, from a user device, a request to perform a query operation, the query operation defined by a set of operations on data from one or more data tables;
accessing one or more clusters on a cloud platform, a cluster configured with a driver node and a set of executor nodes;
determining whether results of the query operation are stored in an in-memory cache of the cluster;
responsive to determining that the results of the query operation are not in the in-memory cache, determining whether the results of the query operation are in a cloud storage cache;
responsive to determining that results of the query operation are not in the cloud storage cache, executing the query operation with the set of executor nodes;
providing the results of executing the query operation to the user device;
generating a cache key for the query operation; and
generating a manifest file for the query operation and storing the results of the query operation in one or more result files, the manifest file associated with the cache key and including information on the results of the query operation.
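
The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; it only mirrors the order of the lookups (in-memory cache of the cluster, then cloud storage cache, then execution) and the writing of a manifest and result files. Every name in it (make_cache_key, CloudStorageCache, Cluster, run_query, the path layout, and the manifest fields) is hypothetical. The sketch also makes three assumptions the claim does not state: the cache key is a hash of the query text and table versions, the key is computed before the lookups so that it can be used for them, and the in-memory cache is populated after a cloud hit or an execution.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Rows = List[dict]


def make_cache_key(query_text: str, table_versions: Dict[str, int]) -> str:
    """Hypothetical key: SHA-256 over the query text and table versions."""
    payload = json.dumps({"query": query_text, "tables": table_versions}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class CloudStorageCache:
    """Dict-backed stand-in for an object store shared by several clusters."""
    objects: Dict[str, str] = field(default_factory=dict)

    def put_results(self, key: str, rows: Rows, rows_per_file: int = 2) -> dict:
        # Store the results in one or more result files.
        files = []
        for start in range(0, len(rows), rows_per_file):
            path = f"results/{key}/part-{start // rows_per_file:05d}.json"
            self.objects[path] = json.dumps(rows[start:start + rows_per_file])
            files.append(path)
        # The manifest is associated with the cache key and describes the results.
        manifest = {"cache_key": key, "result_files": files, "row_count": len(rows)}
        self.objects[f"manifests/{key}.json"] = json.dumps(manifest)
        return manifest

    def get_results(self, key: str) -> Optional[Rows]:
        raw = self.objects.get(f"manifests/{key}.json")
        if raw is None:
            return None  # cloud cache miss
        rows: Rows = []
        for path in json.loads(raw)["result_files"]:
            rows.extend(json.loads(self.objects[path]))
        return rows


@dataclass
class Cluster:
    """One cluster: a private in-memory cache plus the shared cloud cache."""
    cloud_cache: CloudStorageCache
    memory_cache: Dict[str, Rows] = field(default_factory=dict)

    def run_query(self, query_text: str, table_versions: Dict[str, int],
                  execute: Callable[[], Rows]) -> Tuple[Rows, str]:
        key = make_cache_key(query_text, table_versions)
        if key in self.memory_cache:              # 1. in-memory cache of the cluster
            return self.memory_cache[key], "memory"
        rows = self.cloud_cache.get_results(key)  # 2. cloud storage cache
        if rows is not None:
            self.memory_cache[key] = rows         # assumption: warm the local cache
            return rows, "cloud"
        rows = execute()                          # 3. stand-in for the executor nodes
        self.memory_cache[key] = rows
        self.cloud_cache.put_results(key, rows)   # write manifest and result files
        return rows, "executed"
```

In this sketch, two Cluster objects constructed with the same CloudStorageCache share results: a query executed on one cluster is served to the other from the cloud storage cache without calling execute again, and changing a table version produces a different key and therefore a miss.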