US 12,333,773 B2
Explaining a model output of a trained model
Bart Jacob Bakker, Eindhoven (NL); Dimitrios Mavroeidis, Utrecht (NL); and Stojan Trajanovski, London (GB)
Assigned to KONINKLIJKE PHILIPS N.V., Eindhoven (NL)
Appl. No. 17/797,775
Filed by KONINKLIJKE PHILIPS N.V., Eindhoven (NL)
PCT Filed Feb. 7, 2021, PCT No. PCT/EP2021/052895
§ 371(c)(1), (2) Date Aug. 5, 2022,
PCT Pub. No. WO2021/160539, PCT Pub. Date Aug. 19, 2021.
Claims priority of application No. 20156426 (EP), filed on Feb. 10, 2020.
Prior Publication US 2023/0052145 A1, Feb. 16, 2023.
Int. Cl. G06V 10/25 (2022.01); G06N 3/08 (2023.01); G06V 10/44 (2022.01); G06V 10/46 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01)
CPC G06V 10/25 (2022.01) [G06N 3/08 (2013.01); G06V 10/454 (2022.01); G06V 10/464 (2022.01); G06V 10/764 (2022.01); G06V 10/82 (2022.01)] 14 Claims
OG exemplary drawing
 
1. A computer-implemented method of generating explainability information for explaining a model output of a trained model, the trained model being a neural-network type model, the method comprising:
accessing:
a trained model configured to determine a model output for an input instance, the trained model comprising at least a source layer, the source layer being an input layer or an internal layer of the trained model;
one or more aspect recognition models for respective characteristics of input instances of the trained model, an aspect recognition model for a given characteristic being configured to indicate a presence of the characteristic in an input instance, the aspect recognition model comprising at least a target layer, the target layer being an input layer or an internal layer of the aspect recognition model;
obtaining an input instance;
applying the trained model to the input instance to obtain a model output, said applying comprising obtaining a source representation of the input instance at the source layer of the trained model;
applying a saliency method to obtain, at the source layer of the trained model, a masked source representation of the input instance, the masked source representation comprising elements of the source representation relevant to the model output;
for an aspect recognition model for a characteristic:
mapping the masked source representation to the target layer of the aspect recognition model to obtain a target representation for the input instance at the target layer;
applying the aspect recognition model for the characteristic to the target representation to obtain an aspect model output indicating a presence of the characteristic relevant to the model output of the trained model;
outputting, as the explainability information, the characteristics indicated to be present by the applied aspect recognition models.
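For illustration only, and not part of the published claim, the following is a minimal sketch of how the claimed method could be realized in Python with PyTorch. It assumes the source layer is the trained model's input layer, each aspect recognition model's target layer is likewise its own input layer (so the mapping between the two is the identity), and a simple gradient-magnitude saliency method; every name in the sketch (explain, trained_model, aspect_models, keep_fraction) is hypothetical and not taken from the patent.

    import torch

    def explain(trained_model, aspect_models, x, keep_fraction=0.2):
        # aspect_models: dict mapping a characteristic name to a recognition
        # model assumed to emit a single logit for that characteristic.
        x = x.clone().requires_grad_(True)

        # Apply the trained model to the input instance to obtain a model output.
        y = trained_model(x)
        score = y.max()  # score of the predicted class

        # Saliency method (here: gradient magnitude) selecting the elements of
        # the source representation most relevant to the model output.
        score.backward()
        saliency = x.grad.abs()
        threshold = saliency.flatten().quantile(1.0 - keep_fraction)
        masked_source = (x * (saliency >= threshold).float()).detach()

        # For each aspect recognition model: map the masked source
        # representation to its target layer (the identity under the
        # assumptions above) and test whether the characteristic is present.
        present = []
        for characteristic, aspect_model in aspect_models.items():
            with torch.no_grad():
                p = torch.sigmoid(aspect_model(masked_source))
            if p.item() > 0.5:  # characteristic indicated as present
                present.append(characteristic)

        # The characteristics indicated to be present are output as the
        # explainability information.
        return present

Under these assumptions, explain(model, {"spiculation": m1, "calcification": m2}, image) would return the subset of characteristics whose recognition models still fire on the saliency-masked input, i.e. the characteristics that were relevant to the trained model's output.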