US 11,941,140 B2
Platform management of integrated access of public and privately-accessible datasets utilizing federated query generation and query schema rewriting optimization
Bryon Kristen Jacob, Austin, TX (US); David Lee Griffith, Austin, TX (US); Triet Minh Le, Austin, TX (US); Shad William Reynolds, Austin, TX (US); and Arthur Albert Keen, Austin, TX (US)
Assigned to data.world, Inc., Austin, TX (US)
Filed by data.world, Inc., Austin, TX (US)
Filed on Jun. 30, 2022, as Appl. No. 17/854,686.
Application 17/854,686 is a continuation of application No. 16/457,759, filed on Jun. 28, 2019, granted, now 11,386,218.
Application 16/457,759 is a continuation of application No. 16/428,456, filed on May 31, 2019, granted, now 11,093,633.
Application 16/428,456 is a continuation in part of application No. 15/454,955, filed on Mar. 9, 2017, granted, now 10,691,710.
Application 15/454,955 is a continuation in part of application No. 15/454,969, filed on Mar. 9, 2017, granted, now 10,747,774.
Application 15/454,969 is a continuation in part of application No. 15/454,923, filed on Mar. 9, 2017, granted, now 10,353,911.
Application 15/454,923 is a continuation in part of application No. 15/454,981, filed on Mar. 9, 2017, granted, now 10,645,548.
Application 15/454,981 is a continuation of application No. 15/439,911, filed on Feb. 22, 2017, granted, now 10,438,013.
Application 16/428,456 is a continuation of application No. 15/439,911, filed on Feb. 22, 2017, granted, now 10,438,013.
Application 16/457,759 is a continuation in part of application No. 15/186,520, filed on Jun. 19, 2016, granted, now 10,346,429.
Application 15/186,520 is a continuation in part of application No. 15/186,516, filed on Jun. 19, 2016, granted, now 10,452,677.
Application 15/439,911 is a continuation in part of application No. 15/186,515, filed on Jun. 19, 2016, granted, now 10,515,085.
Application 15/186,515 is a continuation in part of application No. 15/186,520, filed on Jun. 19, 2016, granted, now 10,346,429.
Application 16/457,759 is a continuation in part of application No. 15/186,517, filed on Jun. 19, 2016, granted, now 10,324,925.
Application 15/439,911 is a continuation in part of application No. 15/186,517, filed on Jun. 19, 2016, granted, now 10,324,925.
Application 16/457,759 is a continuation in part of application No. 15/186,515, filed on Jun. 19, 2016, granted, now 10,515,085.
Application 15/186,515 is a continuation in part of application No. 15/186,514, filed on Jun. 19, 2016, granted, now 10,102,258.
Application 15/439,911 is a continuation in part of application No. 15/186,514, filed on Jun. 19, 2016, granted, now 10,102,258.
Application 15/186,514 is a continuation in part of application No. 15/186,519, filed on Jun. 19, 2016, granted, now 10,699,027.
Application 16/457,759 is a continuation in part of application No. 15/186,519, filed on Jun. 19, 2016, granted, now 10,699,027.
Prior Publication US 2023/0127572 A1, Apr. 27, 2023
This patent is subject to a terminal disclaimer.
Int. Cl. G06F 16/242 (2019.01); G06F 16/21 (2019.01); G06F 16/2453 (2019.01); G06F 16/901 (2019.01); G06F 21/62 (2013.01); G06N 3/08 (2023.01); G06N 5/022 (2023.01); G06N 5/04 (2023.01)
CPC G06F 21/6218 (2013.01) [G06F 16/213 (2019.01); G06F 16/2423 (2019.01); G06F 16/24534 (2019.01); G06F 16/24542 (2019.01); G06F 16/9024 (2019.01); G06F 21/6227 (2013.01); G06N 3/08 (2013.01); G06N 5/022 (2013.01); G06N 5/04 (2013.01)]
14 Claims
1. A method, comprising:
receiving a query at a dataset access platform, the query being formatted according to a first data schema, the query comprising data associated with a request to access a dataset;
generating a copy of the query;
identifying whether the query is a master or a replica as the copy of the query;
identifying a datastore for storing the query as either the master or the copy, or both;
updating a graph as a data model associated with the query to identify elements to distinguish the copy for data operations to be performed;
parsing the copy of the query in the first schema, the parsing being performed by an inference engine configured to identify the dataset, to infer an attribute associated with the query, and to generate one or more data links between the dataset and another dataset accessible by the dataset access platform, wherein parsing the copy of the query further includes parsing the query into a data structure including an abstract syntax tree associated with a target query language;
rewriting the copy of the query in a second schema including a triples-based format and, if the attribute indicates the query is configured to provide authentication data to access the dataset, the rewriting comprising converting the copy of the query into a triple and converting the attribute into another triple;
optimizing rewriting the copy of the query;
determining one or more property paths to the another dataset;
identifying a database engine to execute the query, the database engine being configured to be topologically internal to a data network associated with the dataset access platform; and
converting other data to a further triple, the other data and the further triple being associated with a path configured to route the query or the copy of the query in the second schema including the triples-based format from the dataset access platform to retrieve query results from a target database configured to store the dataset as graph-based data,
wherein the one or more property paths are determined by performing a comparison of another attribute associated with each of the one or more property paths to a threshold to identify an optimal path to run the query or the copy of the query.
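
Two of the recited steps, rewriting the copy of the query and its attribute as triples and comparing a per-path attribute to a threshold to pick a path, can be illustrated with a minimal Python sketch. This is not the patented implementation. Every name in it (Triple, PropertyPath, query_to_triples, select_path, the ex: prefix, the cost attribute) is hypothetical, and it assumes the compared attribute is a numeric cost where lower is better, which the claim does not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# A triple is modeled here as a plain (subject, predicate, object) tuple.
Triple = Tuple[str, str, str]


@dataclass
class PropertyPath:
    """Hypothetical property path from the dataset to another dataset."""
    predicates: List[str]  # chain of predicates forming the path
    cost: float            # assumed attribute compared against the threshold


def query_to_triples(query_id: str, dataset: str,
                     attributes: Dict[str, object]) -> List[Triple]:
    """Rewrite a query copy as one triple and each attribute as another triple."""
    triples: List[Triple] = [(query_id, "ex:targetsDataset", dataset)]
    for name, value in sorted(attributes.items()):
        triples.append((query_id, f"ex:{name}", str(value)))
    return triples


def select_path(paths: List[PropertyPath],
                threshold: float) -> Optional[PropertyPath]:
    """Keep paths whose cost does not exceed the threshold; return the cheapest."""
    eligible = [p for p in paths if p.cost <= threshold]
    if not eligible:
        return None  # no path satisfies the threshold
    return min(eligible, key=lambda p: p.cost)


if __name__ == "__main__":
    # Illustrative values only; they do not come from the patent.
    copy_triples = query_to_triples(
        "ex:query1", "ex:datasetA", {"providesAuthentication": True}
    )
    candidate_paths = [
        PropertyPath(["ex:linkedTo"], cost=4.0),
        PropertyPath(["ex:sameAs", "ex:partOf"], cost=1.5),
        PropertyPath(["ex:derivedFrom"], cost=9.0),
    ]
    print(copy_triples)
    print(select_path(candidate_paths, threshold=5.0))
```

Filtering on the threshold before taking the minimum is one possible reading of "comparison ... to a threshold to identify an optimal path"; a system could equally treat the threshold as a floor or compare a different attribute, such as latency or hop count.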