| CPC G06F 21/6254 (2013.01) [G06F 21/602 (2013.01); G06N 7/01 (2023.01)] | 20 Claims |

|
1. A method for orchestrating automated data processing and transformation, comprising:
receiving, by a centralized orchestrator, a request to process a dataset associated with a client;
initiating, by the orchestrator, a data ingestion process to obtain sample data from the dataset, wherein the data ingestion process is executed on client premises or in a cloud environment;
analyzing, by a semantic analysis module controlled by the orchestrator, the sample data to determine semantic types of data fields within the dataset;
generating, by a transformation module controlled by the orchestrator, data transformation instructions based on the determined semantic types;
deploying, by the orchestrator, a data processing pipeline to a client-controlled environment, wherein the pipeline is executed within the client premises;
configuring, by the orchestrator, privacy preservation parameters for the data processing pipeline to identify and obfuscate potential personally identifiable information (PII) in the dataset;
instructing the data processing pipeline to apply the data transformation instructions and privacy preservation parameters to the dataset;
determining, by a configuration module controlled by the orchestrator, data storage configurations for the transformed dataset;
directing the storage of the transformed dataset according to the determined data storage configurations, wherein the storage occurs in a client-controlled environment or a cloud environment;
generating, by a machine learning module that is executed in the cloud or on client premises, a machine learning model based on the transformed dataset; and
storing the machine learning model in a model repository accessible to the client.
|
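For illustration only, the semantic-analysis, transformation-instruction, and PII-obfuscation steps recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the regex patterns, the `hash`/`keep` instruction vocabulary, the use of SHA-256 as the obfuscation step, and all function names (`infer_semantic_types`, `build_instructions`, `apply_instructions`) are hypothetical and do not appear in the claim.

```python
import hashlib
import re

# Hypothetical semantic-type patterns; the claim does not specify how types are detected.
SEMANTIC_PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "phone": re.compile(r"\+?\d[\d\-\s]{6,}\d"),
}

# Assumption: these semantic types are treated as potential PII.
PII_TYPES = {"email", "phone"}


def infer_semantic_types(sample_rows):
    """Assign a semantic type to each field, based on sample data only."""
    fields = list(sample_rows[0].keys()) if sample_rows else []
    types = {}
    for field in fields:
        values = [str(row[field]) for row in sample_rows]
        types[field] = "unknown"
        for name, pattern in SEMANTIC_PATTERNS.items():
            # A field gets a type only if every sampled value matches it.
            if all(pattern.fullmatch(v) for v in values):
                types[field] = name
                break
    return types


def build_instructions(semantic_types):
    """Derive per-field transformation instructions from the semantic types."""
    return {
        field: ("hash" if sem_type in PII_TYPES else "keep")
        for field, sem_type in semantic_types.items()
    }


def apply_instructions(rows, instructions):
    """Apply the instructions to the dataset, as a pipeline on client premises might."""
    transformed = []
    for row in rows:
        new_row = {}
        for field, value in row.items():
            if instructions.get(field) == "hash":
                # SHA-256 stands in for whatever obfuscation the pipeline is configured with.
                new_row[field] = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
            else:
                new_row[field] = value
        transformed.append(new_row)
    return transformed


sample = [
    {"name": "Ada", "contact": "ada@example.com"},
    {"name": "Bob", "contact": "bob@example.org"},
]
semantic_types = infer_semantic_types(sample)
instructions = build_instructions(semantic_types)
result = apply_instructions(sample, instructions)
```

In this sketch only the sample rows are inspected to produce the instructions, while the instructions themselves are applied where the full dataset resides, mirroring the split between the orchestrator and the client-side pipeline in the claim. Unsalted hashing is a simple pseudonymization stand-in and would likely be too weak for real PII.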