US 10,769,167 C1 (12,504th)
Federated computational analysis over distributed data
Pablo Prieto Barja, London (GB); Maria Chatzou, London (GB); Matija Sosic, London (GB); Martin Sosic, London (GB); Diogo Nuno Proenca Silva, London (GB); Bruno Filipe Ribeiro Goncalves, London (GB); Tiago Filipe Salgueiro De Jesus, London (GB); Olga Kruglova, London (GB); and Damyan Dobrev, London (GB)
Filed by LIFEBIT BIOTECH LIMITED, London (GB)
Assigned to LIFEBIT BIOTECH LIMITED, London (GB)
Reexamination Request No. 90/015,209, Mar. 9, 2023.
Reexamination Certificate for Patent 10,769,167, issued Sep. 8, 2020, Appl. No. 16/777,637, Jan. 30, 2020.
Ex Parte Reexamination Certificate issued on Feb. 1, 2024.
Int. Cl. G06F 16/25 (2019.01); G06F 16/28 (2019.01); G06N 20/00 (2019.01); G06F 21/62 (2013.01); H04L 9/08 (2006.01); G06F 16/2458 (2019.01); G06F 16/16 (2019.01); G06F 16/22 (2019.01); G06F 16/14 (2019.01)
CPC G06F 16/256 (2019.01) [G06F 16/285 (2019.01); G06F 21/6218 (2013.01); G06N 20/00 (2019.01); H04L 9/0891 (2013.01); G06F 16/148 (2019.01); G06F 16/164 (2019.01); G06F 16/2255 (2019.01); G06F 16/2471 (2019.01)]
AS A RESULT OF REEXAMINATION, IT HAS BEEN DETERMINED THAT:
The patentability of claims 1-8 is confirmed.
Claims 9 and 13-14 are determined to be patentable as amended.
Claims 10-12 and 15-20, dependent on an amended claim, are determined to be patentable.
New claims 21-49 are added and determined to be patentable.
1. A method of performing computational data analysis comprising:
importing a pipeline;
selecting a dataset, the dataset residing on a virtual file system and including data residing on one or more storage locations associated with the virtual file system;
selecting one or more compute resources to perform a pipeline analysis based at least on the imported pipeline and the dataset, the one or more compute resources being selected from a plurality of available compute resources associated with the one or more storage locations associated with the virtual file system;
configuring one or more secure clusters within the virtual file system, the one or more secure clusters including the selected one or more compute resources;
perform the pipeline analysis by streaming the data to the one or more secure clusters within the virtual file system; and
submitting resulting data generated from the pipeline analysis to the virtual file system.
9. A system for performing computational data analysis comprising:
one or more storage locations including input data, the input data on each of the one or more storage locations being accessible by a computing device using a virtual file system; and
an analysis operating system executing on one or more processors of the computing device, the analysis operating system being configured to [ select a dataset from the input data by presenting a user interface with the input data across the storage locations as if the data were in a single location and receive a user input selecting the dataset, the analysis operating system being further configured to ] select one or more compute resources to perform analysis on the [ data set of the ] input data using a pipeline and to create one or more secure clusters within the virtual file system including the one or more compute resources, the one or more compute resources being configured to perform the analysis on the [ data set of the ] input data using the pipeline by streaming [ data of ] the [ data set of the ] input data to the one or more secure clusters, wherein the one or more secure clusters are not accessible after creation, wherein resulting data generated from the analysis is submitted to the virtual file system.
13. The system of claim 9, wherein the analysis operating system is further configured to store results from the analysis of the [ dataset of the ] input data using the pipeline to a location on the virtual file system.
14. A method of performing computational data analysis comprising:
importing a pipeline;
[ selecting a dataset, the dataset residing on a virtual file system and including data residing on one or more of a plurality of storage locations associated with the virtual file system, wherein selecting the dataset comprises presenting, via a user interface, the data across the plurality storage locations associated with the virtual file system as if the data were in a single location and receiving user input selecting the dataset;]
presenting one or more combinations of one or more compute resources to perform a pipeline analysis based on the pipeline, wherein the one or more compute resources are associated with one or more of a [ the ] plurality of storage locations communicatively connected using a [ the ] virtual file system, and wherein the pipeline analysis uses input data located on the plurality of storage locations;
selecting one or more compute resources to perform the pipeline analysis based at least on a user input;
perform the pipeline analysis [ for the selected dataset ] using the one or more selected compute resources, the one or more compute resources being located within one or more secure clusters on the one or more storage locations; and
submitting resulting data generated from the pipeline analysis to one or more selected storage locations of the plurality of storage locations communicatively connected by the virtual file system.
[ 21. A method of performing computational data analysis comprising:
importing a pipeline;
selecting a dataset, the dataset residing on a virtual file system and including data residing on one or more storage locations associated with the virtual file system, wherein selecting the dataset comprises presenting, via a user interface, the data across the storage locations associated with the virtual file system as if the data were in a single location and receiving user input selecting the dataset;
selecting one or more compute resources to perform a pipeline analysis based at least on the imported pipeline and the dataset, the one or more compute resources being selected from a plurality of available compute resources associated with the one or more storage locations associated with the virtual file system;
configuring one or more secure clusters within the virtual file system, the one or more secure clusters including the selected one or more compute resources;
perform the pipeline analysis by streaming data from the dataset to the one or more secure clusters within the virtual file system; and
submitting resulting data generated from the pipeline analysis to the virtual file system.]
[ 22. The method of claim 21, further comprising determining that there is a problem with the pipeline analysis and destroying the one or more secure clusters before the pipeline analysis is complete.]
[ 23. The method of claim 21, further comprising preventing access to the data within the secure cluster by destroying one or more keys that provide access to the one or more secure cluster, wherein the keys are destroyed prior to the pipeline analysis.]
[ 24. The method of claim 21, further comprising receiving a second user selection input to a computing device of a configuration of tokens to access restricted pipelines.]
[ 25. The method of claim 21, wherein the user input includes an indication to add new data to a previous dataset.]
[ 26. The method of claim 21, further comprising presenting options for a user to save the dataset within the virtual file system.]
[ 27. A method of performing computational data analysis comprising:
importing a pipeline;
selecting a dataset, the dataset residing on a virtual file system and including data residing on one or more storage locations associated with the virtual file system;
selecting one or more compute resources to perform a pipeline analysis based at least on the imported pipeline and the dataset, the one or more compute resources being selected from a plurality of available compute resources associated with the one or more storage locations associated with the virtual file system;
configuring one or more secure clusters within the virtual file system by associating the one or more secure clusters with at least respective keys used to access the one or more secure clusters, the one or more secure clusters including the selected one or more compute resources;
destroying the respective keys associated with the one or more secure clusters prior to the pipeline analysis;
perform the pipeline analysis by streaming data from the dataset to the one or more secure clusters within the virtual file system; and
submitting resulting data generated from the pipeline analysis to the virtual file system.]
[ 28. The method of claim 27, further comprising determining that there is a problem with the pipeline analysis and destroying the one or more secure clusters before the pipeline analysis is complete.]
[ 29. The method of claim 27, wherein selecting the dataset comprises presenting, via a single user interface, the data across the storage locations associated with the virtual file system and receiving user input indicating the dataset.]
[ 30. The method of claim 27, further comprising preventing access to the data within the secure cluster by destroying one or more keys that provide access to the one or more secure cluster, wherein the keys are destroyed prior to the pipeline analysis.]
[ 31. The method of claim 27, further comprising receiving a user selection input to a computing device of a configuration of tokens to access restricted pipelines.]
[ 32. The method of claim 27, wherein the user selection input includes an indication to add new data to a previous dataset.]
[ 33. The method of claim 27, further comprising presenting options for a user to save the dataset within the virtual file system.]
[ 34. A method of performing computational data analysis comprising:
receiving a pipeline selection from a user via a first user interface;
receiving a user selection via a second user interface for a computing device of a dataset, the dataset residing on a virtual file system and including data residing on two or more storage locations associated with the virtual file system;
selecting one or more compute resources to perform a pipeline analysis based at least on the imported pipeline and the dataset, the one or more compute resources being selected from a plurality of available compute resources associated with the two or more storage locations associated with the virtual file system;
configuring one or more secure clusters within the virtual file system, the one or more secure clusters including the selected one or more compute resources;
perform the pipeline analysis by streaming data from the dataset to the one or more secure clusters within the virtual file system; and
submitting resulting data generated from the pipeline analysis to the virtual file system.]
[ 35. The method of claim 34, further comprising determining that there is a problem with the pipeline analysis and destroying the one or more secure clusters before the pipeline analysis is complete.]
[ 36. The method of claim 34, wherein selecting the dataset comprises presenting, via the second user interface, the data across the storage locations associated with the virtual file system and receiving a user input, wherein the dataset is presented as a single virtual location on the second user interface.]
[ 37. The method of claim 34, further comprising preventing access to the data within the secure cluster by destroying one or more keys that provide access to the one or more secure cluster, wherein the keys are destroyed prior to the pipeline analysis.]
[ 38. The method of claim 34, further comprising receiving a user selection input to a computing device of a configuration of tokens to access restricted pipelines.]
[ 39. The method of claim 34, wherein the user selection input includes an indication to add new data to a previous dataset.]
[ 40. The method of claim 34, further comprising presenting options for a user to save the dataset within the virtual file system.]
[ 41. The method of claim 1, further comprising presenting an estimated run time of the pipeline analysis for one or more combinations of the one or more compute resources, wherein machine learning is used to improve an estimated run time over time.]
[ 42. The method of claim 1, further comprising destroying the one or more secure clusters before the pipeline analysis is complete when there is a problem with the pipeline analysis, or when the pipeline analysis is complete.]
[ 43. The method of claim 1, further comprising creating the one or more secure clusters on a cloud compute resource, the one or more secure clusters including a monitor configured to return run time data during the pipeline analysis.]
[ 44. The method of claim 1, further comprising destroying one or more keys to the one or more secure cluster prior to the pipeline analysis.]
[ 45. The method of claim 1, wherein the one or more secure clusters are encrypted and protected by a key.]
[ 46. The method of claim 1, wherein the one or more secure clusters are inaccessible after creation.]
[ 47. A method for performing computational analysis comprising:
generating a virtual file system organizing data located on two or more storage locations across a network;
presenting on a first user interface a display of the organized data in the virtual file system as residing in a single location;
receiving a workflow for data analysis via a first user input;
receiving a data selection from a second user input, wherein the second user input is received via the first user interface;
presenting on a second user interface two or more options for compute resources to perform the workflow for the data selection;
receiving a selection of one or more compute resources via the second user interface;
configuring at least one secure cluster for analyzing the data selection based on the workflow, the at least one secure cluster including the selected one or more compute resources;
performing the workflow analysis by streaming the data from the two or more storage locations to the at least one secure cluster within the virtual file system, wherein the data is not accessible within the at least one secure cluster; and
outputting a resultant data to a storage location using the virtual file system.]
[ 48. The method of claim 47, wherein the two or more storage locations include network cloud storage locations that are hosted by different cloud storage providers.]
[ 49. The method of claim 47, further comprising destroying the at least one secure cluster once the resultant data is output to the storage location.]
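For illustration only, the general flow recited in the method claims above (data across several storage locations presented as a single location, a secure cluster whose keys are destroyed before the analysis, data streamed into the cluster, results submitted back to the virtual file system, and the cluster destroyed afterwards) can be sketched in Python. This is a minimal in-memory sketch and not the patentee's implementation: the names VirtualFileSystem, SecureCluster, run_analysis, list_all, stream, submit and destroy_key are all hypothetical, storage locations are modeled as plain dictionaries, and the "pipeline" is assumed to be a simple per-record function.

```python
import secrets
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, Iterator, List


@dataclass
class VirtualFileSystem:
    """Hypothetical stand-in: maps storage location -> {path: records}."""
    locations: Dict[str, Dict[str, List[int]]] = field(default_factory=dict)

    def list_all(self) -> List[str]:
        # Present files from every location as if they were in one place.
        return sorted(p for files in self.locations.values() for p in files)

    def stream(self, paths: Iterable[str]) -> Iterator[int]:
        # Yield records one at a time rather than copying whole files.
        for path in paths:
            for files in self.locations.values():
                if path in files:
                    yield from files[path]

    def submit(self, location: str, path: str, records: List[int]) -> None:
        # Write resulting data back through the virtual file system.
        self.locations.setdefault(location, {})[path] = records


class SecureCluster:
    """Hypothetical cluster wrapping the selected compute resources."""

    def __init__(self, resources: Iterable[str]) -> None:
        self.resources = list(resources)
        self._key = secrets.token_hex(16)  # assumed access key
        self.destroyed = False

    @property
    def accessible(self) -> bool:
        return self._key is not None

    def destroy_key(self) -> None:
        # After this call nothing holds a key to the cluster.
        self._key = None

    def run(self, pipeline: Callable[[int], int],
            records: Iterable[int]) -> List[int]:
        if self.destroyed:
            raise RuntimeError("cluster has been destroyed")
        return [pipeline(r) for r in records]

    def destroy(self) -> None:
        self.destroyed = True
        self.resources.clear()


def run_analysis(vfs: VirtualFileSystem, pipeline: Callable[[int], int],
                 dataset: List[str], resources: List[str],
                 out_location: str, out_path: str) -> SecureCluster:
    cluster = SecureCluster(resources)
    cluster.destroy_key()  # keys destroyed prior to the analysis
    try:
        results = cluster.run(pipeline, vfs.stream(dataset))
        vfs.submit(out_location, out_path, results)
    finally:
        # Destroy the cluster on completion or if the analysis fails.
        cluster.destroy()
    return cluster
```

In this sketch a caller would build a VirtualFileSystem over two hypothetical locations, pick paths from list_all() without regard to where they physically reside, and pass them to run_analysis. The try/finally block is one simple way to model destroying the cluster both on completion and on failure; a real system would involve actual cloud resources, encryption, and access control that are out of scope here.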