CPC G06F 21/566 (2013.01) [G06N 5/045 (2013.01); G06N 20/00 (2019.01); G06F 2221/033 (2013.01)] | 20 Claims |
1. A computer-implemented method for detecting backdoor poisoning of a machine-learned decision-making system (MLDMS), comprising:
receiving the MLDMS, wherein the MLDMS operates on input data samples to produce an output decision that leverages a set of parameters that are learned from a training dataset that may be backdoor-poisoned;
receiving a set of clean (unpoisoned) data samples that are mapped by the MLDMS to a plurality of output values;
using the MLDMS and the clean data samples, estimating a set of potential backdoor perturbations such that incorporating a potential backdoor perturbation into a subset of the clean data samples induces an output decision change;
comparing the set of potential backdoor perturbations to determine a candidate backdoor perturbation based on at least one of perturbation sizes and corresponding output changes; and
using the candidate backdoor perturbation to determine whether the MLDMS has been backdoor-poisoned.
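The detection procedure recited in claim 1 can be illustrated with a toy sketch (not the patented implementation; all names, the linear stand-in model, the optimization settings, and the anomaly threshold are assumptions chosen for illustration). For each output class, gradient descent estimates a minimal additive perturbation that drives the clean samples to that class; the perturbations are then compared by size, and an anomalously small one with a high output-change rate is taken as the candidate backdoor:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear softmax classifier standing in for the MLDMS.
# A large weight on feature 4 of class 2 simulates a planted backdoor:
# activating that (otherwise unused) feature flips inputs to class 2.
n_classes, dim = 3, 5
W = rng.normal(scale=0.5, size=(n_classes, dim))
W[2, 4] = 8.0          # simulated backdoor sensitivity (assumption)
b = np.zeros(n_classes)

def logits(x):
    return x @ W.T + b

def predict(x):
    return np.argmax(logits(x), axis=1)

# Clean (unpoisoned) samples; the trigger feature is inactive in clean data.
X_clean = rng.normal(size=(64, dim))
X_clean[:, 4] = 0.0

def estimate_perturbation(target, steps=300, lr=0.1, lam=0.05):
    """Estimate a minimal additive perturbation delta that induces an
    output-decision change to `target` (cross-entropy toward the target
    class plus an L1 size penalty, minimized by gradient descent)."""
    delta = np.zeros(dim)
    onehot = np.eye(n_classes)[target]
    for _ in range(steps):
        z = logits(X_clean + delta)
        z -= z.max(axis=1, keepdims=True)        # stable softmax
        p = np.exp(z)
        p /= p.sum(axis=1, keepdims=True)
        # d(mean cross-entropy)/d(delta) = (mean(p) - onehot) @ W
        grad = (p.mean(axis=0) - onehot) @ W
        grad += lam * np.sign(delta)             # L1 penalty subgradient
        delta -= lr * grad
    return delta

# One candidate perturbation per output class; compare sizes and the
# corresponding output-decision changes.
deltas = [estimate_perturbation(t) for t in range(n_classes)]
norms = np.array([np.abs(d).sum() for d in deltas])
flip_rates = np.array(
    [(predict(X_clean + d) == t).mean() for t, d in enumerate(deltas)]
)

# An anomalously small perturbation that still flips most clean samples
# is the candidate backdoor perturbation (0.5x-median cutoff is arbitrary).
candidate = int(np.argmin(norms))
suspicious = norms[candidate] < 0.5 * np.median(norms) and flip_rates[candidate] > 0.9
print(candidate, suspicious)
```

Under these assumptions the class-2 perturbation concentrates on the trigger feature and is far smaller than the perturbations needed to reach the other classes, so the system is flagged as backdoor-poisoned. A real detector would operate on the actual learned model and clean dataset rather than this linear stand-in.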