US 11,947,554 B2
Loading collaborative datasets into data stores for queries via distributed computer networks
Bryon Kristen Jacob, Austin, TX (US); David Lee Griffith, Austin, TX (US); Triet Minh Le, Austin, TX (US); Jon Loyens, Austin, TX (US); Brett A. Hurt, Austin, TX (US); and Arthur Albert Keen, Austin, TX (US)
Assigned to data.world, Inc., Austin, TX (US)
Filed by data.world, Inc., Austin, TX (US)
Filed on May 17, 2022, as Appl. No. 17/745,868.
Application 17/745,868 is a continuation of application No. 16/899,542, filed on Jun. 11, 2020, granted, now 11,334,625.
Application 16/899,542 is a continuation of application No. 15/186,519, filed on Jun. 19, 2016, granted, now 10,699,027.
Prior Publication US 2023/0376496 A1, Nov. 23, 2023
Int. Cl. G06F 16/00 (2019.01); G06F 16/21 (2019.01); G06F 16/2458 (2019.01); G06F 16/28 (2019.01)

CPC G06F 16/2471 (2019.01) [G06F 16/219 (2019.01); G06F 16/285 (2019.01)]

20 Claims

1. A method, comprising:

identifying via a network one or more distributed data repositories associated with distributed computer networks, at least one distributed data repository including a cloud-based data store configured to store one or more atomized datasets in a triple data format;

causing to load via the network into the cloud-based data store the one or more atomized datasets in the triple data format; and

implementing one or more portions of an application associated with the distributed computer networks to generate one or more queries, the one or more portions of the application configured to perform data operations associated with a dataset including:

converting the dataset from a first data format to the triple data format, which is configured to form a portion of a graph;

selecting a data store type associated with the cloud-based data store based on the at least one resource requirement;

performing a load operation associated with the one or more atomized datasets as a function of the cloud-based data store based on the at least one resource requirement;

receiving a query to access the dataset;

classifying at least a portion of the query directed to the dataset to determine a classification type, whereby the classification type is associated with a type of query for a query portion; and

applying the portion of the query as a sub-query to the one or more distributed data repositories including the cloud-based data store configured to store the dataset as at least one atomized dataset in the triple data format.
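
As an illustration only, and not the patented implementation, the conversion, classification, and sub-query steps recited in claim 1 might be sketched as follows. Every name here (`rows_to_triples`, `classify_query`, `route_subquery`), the `http://example.org` identifiers, and the keyword-based classification rule are assumptions introduced for this sketch. The claim does not specify how a row is atomized or how a classification type is determined.

```python
from typing import Dict, Iterable, List, Tuple

# A triple is (subject, predicate, object); each triple is one edge of a graph.
Triple = Tuple[str, str, str]


def rows_to_triples(rows: Iterable[Dict[str, object]], base: str, key: str) -> List[Triple]:
    """Convert tabular rows (an assumed 'first data format') into triples.

    Hypothetical scheme: the subject is derived from the row's key column,
    the predicate from the column name, and the object from the cell value.
    """
    triples: List[Triple] = []
    for row in rows:
        subject = f"{base}/row/{row[key]}"
        for column, value in row.items():
            if column == key:
                continue  # the key is already encoded in the subject
            triples.append((subject, f"{base}/col/{column}", str(value)))
    return triples


def classify_query(query: str) -> str:
    """Assign a classification type to a query using a simple keyword heuristic.

    Assumption: graph-pattern queries contain a braced WHERE clause or start
    with a graph-query keyword; any other SELECT is treated as relational.
    """
    text = query.strip().upper()
    if text.startswith(("PREFIX", "CONSTRUCT", "ASK", "DESCRIBE")) or "WHERE {" in text:
        return "graph"
    if text.startswith("SELECT"):
        return "relational"
    return "unknown"


def route_subquery(query: str, repositories: Dict[str, List[str]]) -> List[str]:
    """Return the names of repositories that accept the query's classification type."""
    kind = classify_query(query)
    return [name for name, kinds in repositories.items() if kind in kinds]


if __name__ == "__main__":
    rows = [{"id": 1, "city": "Austin", "state": "TX"}]
    for triple in rows_to_triples(rows, "http://example.org/ds", "id"):
        print(triple)

    # Hypothetical repositories and the query types each one accepts.
    repos = {"cloud_triple_store": ["graph"], "sql_warehouse": ["relational"]}
    print(route_subquery("SELECT ?s WHERE { ?s ?p ?o }", repos))
```

In this sketch a repository is described only by the classification types it accepts, so applying a portion of a query as a sub-query reduces to a lookup. A production system would presumably parse the query rather than match keywords.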