CPC G06F 16/2457 (2019.01) [G06F 9/3005 (2013.01); G06F 9/3867 (2013.01); G06F 9/541 (2013.01); G06F 9/544 (2013.01); G06F 16/211 (2019.01); G06F 16/221 (2019.01); G06F 16/24573 (2019.01); G06F 16/2465 (2019.01); G06F 16/2471 (2019.01); G06F 16/285 (2019.01); G06F 21/31 (2013.01); G06F 21/54 (2013.01); G06F 21/602 (2013.01); G06F 21/6227 (2013.01); G06F 21/6254 (2013.01); G06F 2221/2141 (2013.01)]
18 Claims
1. A data analytics system, comprising:
at least one processor; and
at least one non-transitory computer-readable medium containing instructions that, when executed by the at least one processor, cause the data analytics system to perform operations comprising:
creating at least one data storage;
creating a metadata storage separate from the at least one data storage;
creating a flow storage;
creating an artifact storage storing a plurality of artifacts, the artifacts including scripts, executable binaries, or modules usable by flow services; and
configuring a flow service using first received instructions, the first received instructions specifying a first flow and at least one of a first data source or a first data sink of the first flow, the configuring the flow service including:
obtaining the first flow from the flow storage according to the first received instructions, the first flow specifying a first data transformation;
obtaining an artifact implementing the first data transformation from the artifact storage;
obtaining metadata associated with the first flow from the metadata storage;
determining whether the artifact is authenticated for use with the first flow based, at least in part, on the metadata associated with the first flow;
in response to determining that the artifact is authenticated, executing the first flow, the first flow execution including:
obtaining input data from the first data source in the at least one data storage;
generating output data at least in part by validating, transforming, and serializing the input data using the metadata, the generating including executing the artifact to perform the first data transformation;
generating additional metadata describing the output data, the additional metadata describing a storage location of the output data;
providing the output data for storage in the first data sink at the storage location described by the additional metadata; and
providing the additional metadata for storage in the metadata storage; and
tearing down the first flow upon completion of the execution of the first flow; and
further configuring the flow service using second received instructions, the second received instructions specifying a second flow for displaying the output data provided by the first flow, the further configuring the flow service including:
obtaining the second flow from the flow storage according to the second received instructions;
obtaining the additional metadata generated by the first flow from the metadata storage; and
executing the second flow, the second flow execution including:
obtaining the output data generated by the first flow and stored in the first data sink at the storage location described by the additional metadata; and
generating a view of at least some of the output data using the additional metadata, wherein the view is provided for display on a user device.
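
For illustration only, the operations recited in claim 1 can be sketched as a small in-memory program. This is a minimal sketch under stated assumptions, not the claimed implementation: every name here (`data_storage`, `flow_storage`, `artifact_storage`, `metadata_storage`, `to_cents`, `fingerprint`, `run_first_flow`, `run_second_flow`) is hypothetical, the storages are plain dictionaries, and authentication is assumed to be a fingerprint allow-list held in the flow's metadata, which the claim does not specify.

```python
import hashlib
import json

# Hypothetical in-memory stand-ins for the storages recited in the claim.
data_storage = {"raw/orders": [{"id": 1, "amount": "12.50"},
                               {"id": 2, "amount": "7.25"}]}
flow_storage = {
    "normalize_orders": {"transformation": "to_cents"},  # first flow
    "show_orders": {"displays": "normalize_orders"},     # second flow
}


def to_cents(record):
    """Example artifact: convert a decimal-string amount to integer cents."""
    return {"id": record["id"],
            "amount_cents": round(float(record["amount"]) * 100)}


artifact_storage = {"to_cents": to_cents}


def fingerprint(artifact):
    """Assumed authentication token: SHA-256 of the artifact's bytecode."""
    return hashlib.sha256(artifact.__code__.co_code).hexdigest()


# Metadata storage, kept separate from the data storage.
metadata_storage = {
    "normalize_orders": {
        "required_fields": ["id", "amount"],
        "trusted_fingerprints": [fingerprint(to_cents)],
    }
}


def run_first_flow(flow_name, source, sink):
    """Configure, authenticate, execute, and tear down the first flow."""
    runtime = {  # per-execution state, discarded at teardown
        "flow": flow_storage[flow_name],
        "metadata": metadata_storage[flow_name],
    }
    try:
        artifact = artifact_storage[runtime["flow"]["transformation"]]
        trusted = runtime["metadata"]["trusted_fingerprints"]
        if fingerprint(artifact) not in trusted:
            raise PermissionError("artifact is not authenticated for this flow")

        records = data_storage[source]
        for record in records:  # validate input against the metadata
            for field in runtime["metadata"]["required_fields"]:
                if field not in record:
                    raise ValueError(f"missing field: {field}")

        transformed = [artifact(record) for record in records]  # transform
        output = json.dumps(transformed)                        # serialize

        # Additional metadata describes the output and where it is stored.
        additional = {"location": sink, "format": "json",
                      "fields": sorted(transformed[0])}
        data_storage[sink] = output
        metadata_storage[flow_name + ":output"] = additional
        return additional
    finally:
        runtime.clear()  # tear down the flow once execution completes


def run_second_flow(flow_name):
    """Locate the first flow's output via its metadata and build a text view."""
    upstream = flow_storage[flow_name]["displays"]
    additional = metadata_storage[upstream + ":output"]
    records = json.loads(data_storage[additional["location"]])
    fields = additional["fields"]
    lines = [" | ".join(fields)]
    lines += [" | ".join(str(rec[f]) for f in fields) for rec in records]
    return "\n".join(lines)
```

In this sketch, calling `run_first_flow("normalize_orders", "raw/orders", "clean/orders")` followed by `run_second_flow("show_orders")` produces a three-line text table of the transformed records. The second flow never receives the sink name directly; it finds the output only through the additional metadata written by the first flow, which mirrors the dependency recited in the claim.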