| CPC G06N 3/045 (2023.01) [G06F 9/5072 (2013.01); G06N 3/08 (2013.01); G06N 20/00 (2019.01); H04L 9/3239 (2013.01); H04L 9/0637 (2013.01)] | 17 Claims |

|
1. A method performed by at least one hardware processor, comprising:
receiving a plurality of datasets, a dataset in the plurality of datasets associated with a data provider;
running multiple instances of a machine learning algorithm to create corresponding multiple machine learning models trained for a specific task in a given domain, each of the multiple instances using a different subset of the plurality of datasets in training the corresponding machine learning model;
running the multiple machine learning models with input data, wherein the multiple machine learning models produce corresponding multiple outcomes;
determining a candidate machine learning model from the multiple machine learning models based on comparing each of the multiple outcomes with ground-truth output;
determining a value associated with the different subset of the plurality of datasets used as a training set based on the comparing of each of the multiple outcomes with ground-truth output, the value corresponding to quality associated with the different subset of the plurality of datasets, the quality associated with the different subset of the plurality of datasets being determined based on contribution of the different subset of the plurality of datasets to how accurately a machine learning model that is trained using the different subset of the plurality of datasets produced an outcome;
creating a smart contract comprising the value and the different subset of the plurality of datasets, the smart contract protecting security of training data used by the machine learning algorithm; and
recording the smart contract in a blockchain,
wherein the machine learning algorithm is selected by a user via a user interface coupled with the at least one hardware processor, the value further provided with a percentage of different data sets that make up the different subset and the data providers associated with the different data sets.
|
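The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation: the 1-nearest-neighbour learner, the accuracy-based value, the provider names `A`, `B`, `C`, the toy data, and the helper names (`evaluate_subsets`, `append_block`, `chain_is_valid`) are all assumptions made for illustration, and the hash-chained list only stands in for a blockchain rather than using any real smart contract platform.

```python
import hashlib
import json
from itertools import combinations


def train_1nn(rows):
    """Stand-in learning algorithm: a 1-nearest-neighbour model stores its rows."""
    return list(rows)


def predict(model, x):
    """Return the label of the stored row whose feature is closest to x."""
    return min(model, key=lambda row: abs(row[0] - x))[1]


def accuracy(model, test_rows):
    """Fraction of outcomes that match the ground-truth labels."""
    return sum(predict(model, x) == y for x, y in test_rows) / len(test_rows)


def evaluate_subsets(datasets, test_rows):
    """Train one model per non-empty subset of providers and value each subset."""
    providers = sorted(datasets)
    results = []
    for size in range(1, len(providers) + 1):
        for subset in combinations(providers, size):
            rows = [row for p in subset for row in datasets[p]]
            model = train_1nn(rows)
            results.append({
                "subset": list(subset),
                # Assumed value: accuracy of the model trained on this subset.
                "value": accuracy(model, test_rows),
                # Percentage of the training rows contributed by each provider.
                "provider_share_pct": {
                    p: 100.0 * len(datasets[p]) / len(rows) for p in subset
                },
            })
    return results


def block_hash(prev_hash, payload):
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_block(chain, payload):
    """Append a record to a toy hash-chained ledger (hypothetical, not a real chain)."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    block = {"prev": prev_hash, "payload": payload,
             "hash": block_hash(prev_hash, payload)}
    chain.append(block)
    return block


def chain_is_valid(chain):
    """Recompute every hash and check that each block links to its predecessor."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(block["prev"], block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


# Toy datasets, one per hypothetical data provider: (feature, label) rows.
datasets = {
    "A": [(0.0, 0), (1.0, 0)],
    "B": [(4.0, 1), (5.0, 1)],
    "C": [(0.5, 1), (4.5, 0)],  # deliberately noisy labels
}
# Input data paired with ground-truth output.
test_rows = [(0.2, 0), (0.9, 0), (4.1, 1), (4.8, 1)]

results = evaluate_subsets(datasets, test_rows)
# On ties, max() keeps the first (here, the smallest) best-scoring subset.
candidate = max(results, key=lambda r: r["value"])

chain = []
for record in results:
    append_block(chain, record)
```

In this sketch each recorded payload carries the subset, its value, and the per-provider percentages, mirroring the contents the claim attributes to the smart contract. A different valuation (for example, a marginal-contribution score) could be substituted for plain accuracy without changing the overall flow.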