US 11,875,586 B2
Detection and mitigation of cyber attacks on binary image recognition systems
Eric Balkanski, New York, NY (US); Harrison Chase, San Francisco, CA (US); Kojin Oshiba, San Francisco, CA (US); Alexander Rilee, Cambridge, MA (US); Yaron Singer, Menlo Park, CA (US); and Richard Wang, West Hills, CA (US)
Assigned to ROBUST INTELLIGENCE, INC., San Francisco, CA (US)
Filed by Robust Intelligence, Inc., San Francisco, CA (US)
Filed on Feb. 5, 2021, as Appl. No. 17/168,547.
Claims priority of provisional application 62/971,021, filed on Feb. 6, 2020.
Prior Publication US 2021/0248241 A1, Aug. 12, 2021
Int. Cl. G06F 21/57 (2013.01); G06F 18/2433 (2023.01); G06V 30/40 (2022.01); G06N 20/00 (2019.01); G06V 30/20 (2022.01); G06V 10/74 (2022.01); G06V 30/10 (2022.01)

CPC G06V 30/40 (2022.01) [G06F 18/2433 (2023.01); G06F 21/577 (2013.01); G06N 20/00 (2019.01); G06V 10/761 (2022.01); G06V 30/20 (2022.01); G06F 2221/034 (2013.01); G06V 30/10 (2022.01)]

14 Claims

1. A computer-implemented method for detecting vulnerabilities of a model for binary image classification, comprising:

receiving, by a computer system, a binary image data, the computer system configured to detect a pixel value in the binary image data to represent a non-machine language value related to the binary image data;

determining, by the computer system, that the binary image data further comprises at least a pixel value that is altered in a manner to change the non-machine language value related to the binary image data when read by an image recognition system; and

alerting, by the computer system, to the image recognition system to review the binary image data,

wherein the image recognition system includes:

a first artificial intelligence model that classifies a first portion of the binary image data that represents a numerical amount written in numbers; and

a second artificial intelligence model that classifies a second portion of the binary image data that represents the numerical amount written in letters;

wherein said determining includes determining that the first and second artificial intelligence models are attacked simultaneously such that the changed non-machine language value associated with the first portion matches the changed non-machine language value associated with the second portion when read by the image recognition system, and

wherein said determining that the first and second artificial intelligence models are attacked simultaneously comprises determining that an untargeted attack using a shaded combinatorial attack on recognition systems is used on at least one of the first and second artificial intelligence models, wherein the shaded combinatorial attack includes one or more iterations, at least one of the iterations including evaluating gains of one or more pixels of the binary image data based upon spatial and temporal correlations among the gains of the pixels across the iterations.
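
For illustration only, the iteration described in the final wherein clause can be sketched as a greedy pixel-flipping loop. This is a minimal sketch under stated assumptions, not the claimed method itself: the names `shaded_greedy_attack`, `boundary_pixels`, `toy_score`, and the `radius` parameter are hypothetical, the "shaded" restriction is assumed to mean that only pixels on an ink/background boundary are candidates, a "gain" is assumed to be the drop in the model's score for the true label when one pixel is flipped, and the linear `toy_score` stands in for a real classifier. Spatial correlation is approximated by re-evaluating gains only near the most recent flip; temporal correlation is approximated by reusing gains cached in earlier iterations for all other pixels.

```python
from typing import Callable, Dict, List, Set, Tuple

Image = List[List[int]]   # binary image: 0 = background, 1 = ink
Pixel = Tuple[int, int]   # (row, column)


def flip(img: Image, p: Pixel) -> Image:
    """Return a copy of img with pixel p inverted."""
    out = [row[:] for row in img]
    r, c = p
    out[r][c] = 1 - out[r][c]
    return out


def boundary_pixels(img: Image) -> List[Pixel]:
    """Pixels with at least one 4-neighbour of the opposite value (row-major order)."""
    h, w = len(img), len(img[0])
    found = []
    for r in range(h):
        for c in range(w):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and img[rr][cc] != img[r][c]:
                    found.append((r, c))
                    break
    return found


def shaded_greedy_attack(
    img: Image,
    score: Callable[[Image], float],  # assumed: > 0 means the true label is still predicted
    budget: int,
    radius: int = 1,                  # hypothetical neighbourhood size for re-evaluation
) -> Tuple[Image, List[Pixel]]:
    """Untargeted attack sketch: flip boundary pixels until score <= 0 or budget runs out."""
    current = [row[:] for row in img]
    gains: Dict[Pixel, float] = {}                     # cached across iterations
    stale: Set[Pixel] = set(boundary_pixels(current))  # gains to (re)compute this round
    flipped: List[Pixel] = []
    for _ in range(budget):
        base = score(current)
        if base <= 0:          # label already changed: untargeted attack succeeded
            break
        for p in stale:        # only stale gains cost a model query
            gains[p] = base - score(flip(current, p))
        candidates = [p for p in boundary_pixels(current)
                      if p in gains and p not in flipped]
        if not candidates:
            break
        best = max(candidates, key=lambda p: gains[p])
        if gains[best] <= 0:   # no remaining flip lowers the score
            break
        current = flip(current, best)
        flipped.append(best)
        # Next round, refresh only boundary pixels close to the flip just made.
        stale = {p for p in boundary_pixels(current)
                 if abs(p[0] - best[0]) <= radius
                 and abs(p[1] - best[1]) <= radius
                 and p not in flipped}
    return current, flipped


# Toy stand-in for a classifier: ink in column 2 supports the true label.
WEIGHTS = [[1.0 if c == 2 else -0.5 for c in range(5)] for _ in range(5)]


def toy_score(img: Image) -> float:
    return sum(WEIGHTS[r][c] * img[r][c] for r in range(5) for c in range(5)) - 1.5


# A 5x5 image containing a vertical bar in column 2, rows 1 to 3.
bar = [[1 if (c == 2 and 1 <= r <= 3) else 0 for c in range(5)] for r in range(5)]
adv, flipped = shaded_greedy_attack(bar, toy_score, budget=5)
```

In this toy setting the gain cache is exact because the stand-in model is linear; with a real classifier the cached gains would only be approximations, which is the trade-off the assumed spatial and temporal reuse makes in exchange for fewer model queries.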