US 12,456,034 B2
Image classification explanation by generating boundary crossing examples with removed features via filter suppression
Andres Mauricio Munoz Delgado, Weil Der Stadt (DE)
Assigned to ROBERT BOSCH GMBH, Stuttgart (DE)
Filed by Robert Bosch GmbH, Stuttgart (DE)
Filed on Apr. 13, 2021, as Appl. No. 17/229,275.
Claims priority of application No. 20170305 (EP), filed on Apr. 20, 2020.
Prior Publication US 2021/0326661 A1, Oct. 21, 2021
Int. Cl. G06N 3/045 (2023.01); G06N 3/0455 (2023.01); G06N 3/0464 (2023.01); G06N 3/0475 (2023.01); G06N 3/094 (2023.01); G06N 5/045 (2023.01)
CPC G06N 3/045 (2023.01) [G06N 3/0455 (2023.01); G06N 3/0464 (2023.01); G06N 3/0475 (2023.01); G06N 3/094 (2023.01); G06N 5/045 (2013.01)]
13 Claims
1. A computer-implemented method of determining a classification explanation for a trained classifier, the classification explanation being for one or more classifier inputs classified by the trained classifier into a same class, the method comprising the following steps:
accessing model data defining the trained classifier and model data defining a generative model, the generative model being configured to generate a classifier input for the trained classifier from a generator input, the generative model including multiple filters, each filter of the generative model being configured to generate a filter output at an internal layer of the generative model;
obtaining generator inputs corresponding to the one or more classifier inputs, each generator input causing the generative model to approximately generate the corresponding classifier input;
determining filter suppression factors for the multiple filters of the generative model, each filter suppression factor for each filter indicating a degree of suppression for the filter output of the filter, the filter suppression factors being determined based on an effect of adapting the classifier inputs according to the filter suppression factors on the classification by the trained classifier, the determining including:
adapting each classifier input according to one or more of the filter suppression factors by applying the generative model to the generator input corresponding to the classifier input, while modulating the filter outputs of the filters of the generative model according to the one or more filter suppression factors, and
applying the trained classifier to each adapted classifier input to obtain a classifier output affected by the one or more filter suppression factors;
determining the classification explanation in terms of the filter suppression factors such that the classification explanation excludes any image representation and comprises a set of the filter suppression factors, and outputting the classification explanation, wherein:
the trained classifier is an image classifier, and
the classifier input includes an image of a product produced in a manufacturing process;
controlling the manufacturing process based on the classification of the classification explanation; and
determining the filter suppression factors by performing an optimization configured to: (i) minimize a difference between a target classifier output and affected classifier outputs of the trained classifier for the one or more classifier inputs affected by the filter suppression factors, and (ii) minimize an overall degree of suppression indicated by the filter suppression factors.
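For illustration only, the optimization recited in the claim can be sketched on a toy model. This is not the patented implementation. Everything below is an assumption made for the example: the tiny hand-written "generative model" with three filters, the logistic "classifier", the weights, the squared-error mismatch term, the sum-of-factors sparsity penalty, and the finite-difference projected gradient descent. The names `generate`, `classify`, `objective`, and `explain` are hypothetical. The step of controlling the manufacturing process is omitted.

```python
import math

# Hypothetical toy weights; a real generative model would be a trained network.
FILTER_WEIGHTS = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # 3 filters, 2-D generator input
DECODER = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]           # 4 "pixels" from 3 filter outputs
CLASSIFIER_W = [2.0, 2.0, -1.0, 0.0]                   # toy logistic classifier
CLASSIFIER_B = -1.0


def generate(z, suppression):
    """Generate a classifier input from z, scaling filter output k by (1 - s_k)."""
    filter_out = [max(0.0, sum(w * zj for w, zj in zip(row, z)))
                  for row in FILTER_WEIGHTS]
    modulated = [h * (1.0 - s) for h, s in zip(filter_out, suppression)]
    return [sum(d * m for d, m in zip(row, modulated)) for row in DECODER]


def classify(image):
    """Return the toy classifier's probability for the class of interest."""
    logit = sum(w * x for w, x in zip(CLASSIFIER_W, image)) + CLASSIFIER_B
    return 1.0 / (1.0 + math.exp(-logit))


def objective(suppression, generator_inputs, target, sparsity_weight):
    """(i) mismatch to the target output plus (ii) overall degree of suppression."""
    outputs = [classify(generate(z, suppression)) for z in generator_inputs]
    mismatch = sum((p - target) ** 2 for p in outputs) / len(outputs)
    return mismatch + sparsity_weight * sum(suppression)


def explain(generator_inputs, target, sparsity_weight=0.05,
            lr=0.5, steps=200, eps=1e-4):
    """Projected gradient descent on the factors, kept inside [0, 1]."""
    s = [0.0] * len(FILTER_WEIGHTS)
    for _ in range(steps):
        grad = []
        for k in range(len(s)):
            up = s[:k] + [s[k] + eps] + s[k + 1:]
            down = s[:k] + [s[k] - eps] + s[k + 1:]
            # Central finite difference, an assumption made for brevity.
            diff = (objective(up, generator_inputs, target, sparsity_weight)
                    - objective(down, generator_inputs, target, sparsity_weight))
            grad.append(diff / (2.0 * eps))
        s = [min(1.0, max(0.0, sk - lr * g)) for sk, g in zip(s, grad)]
    return s


# Two generator inputs whose generated images fall into the same class.
generator_inputs = [[1.0, 0.5], [0.8, 0.2]]
no_suppression = [0.0, 0.0, 0.0]
before = [classify(generate(z, no_suppression)) for z in generator_inputs]
factors = explain(generator_inputs, target=0.0)
after = [classify(generate(z, factors)) for z in generator_inputs]
print("explanation (suppression factors only):", factors)
print("classifier outputs before:", before, "after:", after)
```

In this toy setup only filter 0 pushes the classifier toward the original class, so the optimization should suppress that filter strongly and leave the other two essentially untouched. The returned list of factors, with no image attached, plays the role of the classification explanation. The sparsity weight trades off how far the classifier output moves toward the target against how many filters are suppressed.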