US 11,853,436 B2
Protecting cognitive systems from model stealing attacks
Taesung Lee, Ridgefield, CT (US); Ian M. Molloy, Chappaqua, NY (US); and Dong Su, Sunnyvale, CA (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Apr. 15, 2021, as Appl. No. 17/231,369.
Application 17/231,369 is a continuation of application No. 15/714,514, filed on Sep. 25, 2017, granted, now Pat. No. 11,023,593.
Prior Publication US 2021/0303703 A1, Sep. 30, 2021
This patent is subject to a terminal disclaimer.
Int. Cl. G06N 3/04 (2023.01); G06N 3/08 (2023.01); G06F 21/60 (2013.01); G06Q 10/06 (2023.01); G06N 3/082 (2023.01)
CPC G06F 21/602 (2013.01) [G06N 3/04 (2013.01); G06N 3/08 (2013.01); G06N 3/082 (2013.01); G06Q 10/06 (2013.01); H04L 2209/16 (2013.01)] 18 Claims
OG exemplary drawing
 
1. A method for obfuscating training of trained cognitive model logic, the method being performed in a data processing system comprising a processor and a memory, the memory comprising instructions executed by the processor to specifically configure the processor to implement a cognitive system comprising the trained cognitive model logic, the method comprising:
receiving, by the trained cognitive model logic of the cognitive system, input data for classification into at least one class, of a plurality of predefined classes, as part of a cognitive operation of the cognitive system;
processing, by the trained cognitive model logic, the input data by applying a trained cognitive model to the input data to generate an output vector having values for each of the plurality of predefined classes;
modifying, by a perturbation insertion engine of the cognitive system, one or more values of the output vector by inserting a perturbation in a function associated with generating the output vector, to thereby generate a modified output vector; and
outputting, by the trained cognitive model logic, the modified output vector, wherein the perturbation modifies the one or more values to obfuscate the trained configuration of the trained cognitive model logic, wherein modifying the one or more values of the output vector by inserting the perturbation in the function associated with generating the output vector comprises adding noise to an output of the function that modifies output values associated with non-boundary case output values to obfuscate the non-boundary case output values, but has minimal change in the modified output vector compared to the output vector generated by the function without insertion of the perturbation for boundary case output values.
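The perturbation insertion step of claim 1 can be illustrated with a short sketch. The code below is a hypothetical reading of the claimed technique, not the patented implementation: it adds Gaussian noise to the output probability vector when the top two class scores are well separated (a non-boundary case), applies only negligible noise when the scores are close (a boundary case), and preserves the top-1 class so legitimate users still receive the correct classification. The function name, the `boundary_margin` threshold, and the fallback behavior are all illustrative assumptions.

```python
import numpy as np

def perturb_output_vector(probs, noise_scale=0.05, boundary_margin=0.1, rng=None):
    """Hypothetical perturbation insertion engine: obfuscate confident
    (non-boundary) outputs with noise, leave near-boundary outputs almost
    unchanged, and keep the predicted class identical to the clean model."""
    rng = np.random.default_rng() if rng is None else rng
    probs = np.asarray(probs, dtype=float)
    top2 = np.sort(probs)[-2:]           # [second-largest, largest] class scores
    margin = top2[1] - top2[0]           # gap between top-1 and top-2
    # Non-boundary case (large margin): insert meaningful noise.
    # Boundary case (small margin): minimal change, per the claim.
    scale = noise_scale if margin > boundary_margin else noise_scale * 0.01
    noisy = probs + rng.normal(0.0, scale, size=probs.shape)
    noisy = np.clip(noisy, 1e-6, None)   # keep values positive
    noisy /= noisy.sum()                 # renormalize to a probability vector
    # Never flip the predicted label; fall back to the clean vector instead.
    if np.argmax(noisy) != np.argmax(probs):
        return probs
    return noisy
```

A model-stealing adversary who trains a surrogate on these modified output vectors learns a distorted picture of the model's confidence landscape, while an ordinary caller reading only the argmax class sees no difference.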