| CPC G06F 21/577 (2013.01) [G06F 21/552 (2013.01)] | 20 Claims |

|
1. A method for training a machine learning model, the method comprising:
accessing a candidate training dataset for the machine learning model;
evaluating the candidate training dataset using a first filter layer, wherein evaluating the candidate training dataset using the first filter layer comprises:
applying at least one provenance filter to verify a provenance of the candidate training dataset by verifying at least one of: a signature of the candidate training dataset, a hash of the candidate training dataset, or a secure sockets layer/transport layer security (SSL/TLS) certificate of a source of the candidate training dataset; and
when the provenance of the candidate training dataset is verified:
evaluating the candidate training dataset using a second filter layer, wherein evaluating the candidate training dataset using the second filter layer comprises:
determining at least one content filter to apply to the candidate training dataset, wherein the at least one content filter is configured to determine whether the candidate training dataset contains poisoned data;
applying the at least one content filter to the candidate training dataset;
determining, based on applying the at least one content filter to the candidate training dataset, an integrity level of the candidate training dataset;
determining whether the integrity level satisfies a threshold integrity value; and
when the integrity level satisfies the threshold integrity value, training the machine learning model using the candidate training dataset.
|
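The two-layer gating recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes the provenance filter is a SHA-256 hash comparison (one of the three options the claim lists), that the integrity level is the fraction of records passing every content filter, and that "satisfies" means greater than or equal to the threshold. The names `verify_provenance`, `integrity_level`, `gate_and_train`, and `no_trigger_token` are hypothetical and appear nowhere in the source.

```python
import hashlib
from typing import Callable, List, Sequence

# Hypothetical type: a content filter returns True when a record looks clean.
ContentFilter = Callable[[str], bool]


def verify_provenance(raw: bytes, expected_sha256: str) -> bool:
    """First filter layer (assumed form): compare the dataset's SHA-256 hash
    against a digest published by the dataset's source."""
    return hashlib.sha256(raw).hexdigest() == expected_sha256.lower()


def integrity_level(records: Sequence[str], filters: Sequence[ContentFilter]) -> float:
    """Second filter layer (assumed metric): fraction of records that pass
    every content filter. An empty dataset scores 0.0."""
    if not records:
        return 0.0
    clean = sum(1 for record in records if all(f(record) for f in filters))
    return clean / len(records)


def no_trigger_token(record: str) -> bool:
    """Toy content filter: flags records containing an assumed backdoor token."""
    return "TRIGGER" not in record


def gate_and_train(
    raw: bytes,
    expected_sha256: str,
    filters: Sequence[ContentFilter],
    threshold: float,
    train: Callable[[List[str]], object],
) -> bool:
    """Run both layers in order; call train() only if both pass.
    Returns True when training was started, False otherwise."""
    # Layer 1: the content filters are never reached if provenance fails.
    if not verify_provenance(raw, expected_sha256):
        return False
    # Layer 2: score the dataset and compare against the threshold.
    records = raw.decode("utf-8").splitlines()
    if integrity_level(records, filters) < threshold:
        return False
    train(records)
    return True
```

In this sketch, a caller would pass the raw dataset bytes, the digest obtained from the source, a list of filters such as `[no_trigger_token]`, a threshold such as `0.7`, and a training callback. Ordering the hash check first means the costlier content filters run only on datasets whose provenance has already been verified, which mirrors the conditional structure of the claim.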