US 12,217,176 B2
Automatic identification and classification of adversarial attacks
Eric Piegert, Kriftel (DE); Michelle Karg, Lindau (DE); and Christian Scharfenberger, Lindau (DE)
Assigned to Conti Temic microelectronic GmbH, Nuremberg (DE)
Appl. No. 17/593,558
Filed by Conti Temic microelectronic GmbH, Nuremberg (DE)
PCT Filed Mar. 17, 2020, PCT No. PCT/DE2020/200018
§ 371(c)(1), (2) Date Sep. 21, 2021,
PCT Pub. No. WO2020/192849, PCT Pub. Date Oct. 1, 2020.
Claims priority of application No. 10 2019 204 318.6 (DE), filed on Mar. 28, 2019.
Prior Publication US 2022/0174089 A1, Jun. 2, 2022
Int. Cl. G06V 10/764 (2022.01); G06N 3/045 (2023.01); G06N 3/08 (2023.01); G06V 10/44 (2022.01); G06V 10/82 (2022.01); G06V 20/56 (2022.01); H04L 9/40 (2022.01)
CPC G06N 3/08 (2013.01) [G06N 3/045 (2023.01); G06V 10/454 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01); G06V 20/56 (2022.01); H04L 63/1441 (2013.01)] 18 Claims
 
1. A method for identifying and classifying adversarial attacks on an automated detection system, comprising:
providing a reference signal and a potentially manipulated signal, wherein each signal includes at least one of an image signal, a video signal, or an audio signal,
calculating a set of n metrics which quantify differences between the reference signal and the potentially manipulated signal in different ways, with n being a natural number greater than one,
creating an n-dimensional feature space based on the calculated metrics,
classifying the type of adversarial attack on the basis of the calculated metrics in the n-dimensional feature space, and
outputting the class of the adversarial attack,
wherein the automated detection system comprises at least one trained neural network, and the reference signal and the potentially manipulated signal are provided following completion of a training phase of the at least one neural network, and
wherein subsets are created from the n metrics in order to extract the m most relevant metrics, with m being a natural number less than n, and wherein the classification of the type of the adversarial attack is effected on the basis of the calculated metrics in the m-dimensional feature space.
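
The steps of claim 1 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the claim: the four metrics (n = 4), the Fisher-style score used to pick m = 2 of them, the nearest-centroid classifier, and the two toy perturbations `dense_noise` and `sparse_patch`, which merely stand in for attack classes. The reference and manipulated signals are random 16x16 arrays rather than real images, and the trained neural network of the detection system is not modelled.

```python
import numpy as np

def compute_metrics(reference, candidate):
    """Return n = 4 metrics that quantify the difference in different ways."""
    diff = candidate.astype(np.float64) - reference.astype(np.float64)
    abs_diff = np.abs(diff)
    return np.array([
        np.mean(diff ** 2),         # mean squared error
        np.mean(abs_diff),          # mean absolute error
        np.max(abs_diff),           # largest single change (L-infinity norm)
        np.mean(abs_diff > 1e-12),  # fraction of samples that changed
    ])

def select_metrics(features, labels, m):
    """Pick the m metrics with the highest Fisher-style score (assumed criterion)."""
    overall_mean = features.mean(axis=0)
    between = np.zeros(features.shape[1])
    within = np.zeros(features.shape[1])
    for c in np.unique(labels):
        rows = features[labels == c]
        between += len(rows) * (rows.mean(axis=0) - overall_mean) ** 2
        within += ((rows - rows.mean(axis=0)) ** 2).sum(axis=0)
    score = between / (within + 1e-12)
    return np.sort(np.argsort(score)[-m:])

class NearestCentroid:
    """Assign each point to the class whose scaled centroid is closest."""
    def fit(self, features, labels):
        self.classes_ = np.unique(labels)
        self.scale_ = features.std(axis=0) + 1e-12
        scaled = features / self.scale_
        self.centroids_ = np.stack(
            [scaled[labels == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict(self, features):
        scaled = features / self.scale_
        dist = np.linalg.norm(scaled[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[np.argmin(dist, axis=1)]

# Hypothetical toy perturbations standing in for two attack classes.
def dense_noise(image, rng):
    """Small bounded change applied to every sample."""
    return np.clip(image + rng.uniform(-0.02, 0.02, image.shape), 0.0, 1.0)

def sparse_patch(image, rng):
    """Large change confined to a 4x4 patch."""
    out = image.copy()
    r, c = rng.integers(0, image.shape[0] - 4, size=2)
    out[r:r + 4, c:c + 4] = 1.0 - out[r:r + 4, c:c + 4]
    return out

def build_dataset(rng, samples_per_class=40):
    """Create (reference, manipulated) pairs and map each to its metric vector."""
    attacks = {"dense_noise": dense_noise, "sparse_patch": sparse_patch}
    feats, labels = [], []
    for name, attack in attacks.items():
        for _ in range(samples_per_class):
            ref = rng.uniform(0.1, 0.9, size=(16, 16))
            feats.append(compute_metrics(ref, attack(ref, rng)))
            labels.append(name)
    return np.array(feats), np.array(labels)

rng = np.random.default_rng(0)
train_x, train_y = build_dataset(rng)                # n-dimensional feature space
selected = select_metrics(train_x, train_y, m=2)     # subset of m < n metrics
clf = NearestCentroid().fit(train_x[:, selected], train_y)

test_x, test_y = build_dataset(rng, samples_per_class=10)
predicted = clf.predict(test_x[:, selected])         # output class of the attack
accuracy = float(np.mean(predicted == test_y))
print("selected metric indices:", selected)
print("accuracy on toy data:", accuracy)
```

In this sketch the classifier sees only the m selected columns, mirroring the claim's requirement that classification take place in the m-dimensional feature space. Any other metric set, selection criterion, or classifier could be substituted without changing the overall structure.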