US 11,809,519 B2
Semantic input sampling for explanation (SISE) of convolutional neural networks
Jongseong Jang, Toronto (CA); Hyunwoo Kim, Seoul (KR); YeonJeong Jeong, Toronto (CA); SangMin Lee, Seoul (KR); Sam Sattarzadeh, Richmond Hill (CA); Mahesh Sudhakar, Toronto (CA); Shervin Mehryar, Toronto (CA); Anthony Lem, Calgary (CA); and Konstantinos Plataniotis, Toronto (CA)
Assigned to LG ELECTRONICS INC., Seoul (KR); and The Governing Council of the University of Toronto, Toronto (CA)
Filed by LG ELECTRONICS INC., Seoul (KR); and The Governing Council of the University of Toronto, Toronto (CA)
Filed on Aug. 18, 2021, as Appl. No. 17/405,935.
Claims priority of application No. 10-2020-0103902 (KR), filed on Aug. 19, 2020.
Prior Publication US 2022/0058431 A1, Feb. 24, 2022
Int. Cl. G06F 18/213 (2023.01); G06N 3/04 (2023.01); G06V 10/28 (2022.01); G06V 10/32 (2022.01); G06F 18/25 (2023.01); G06F 18/2113 (2023.01)
CPC G06F 18/213 (2023.01) [G06F 18/2113 (2023.01); G06F 18/253 (2023.01); G06N 3/04 (2013.01); G06V 10/28 (2022.01); G06V 10/32 (2022.01)] 15 Claims
OG exemplary drawing
 
1. A method for outputting an explanation map for an output determination of a convolutional neural network (CNN) based on an input image, the method comprising:
extracting a plurality of sets of feature maps from a corresponding plurality of pooling layers of the CNN;
obtaining a plurality of attribution masks based on subsets of the plurality of sets of feature maps;
applying the plurality of attribution masks to copies of the input image to obtain a plurality of perturbed input images;
obtaining a plurality of visualization maps based on confidence scores obtained by inputting the plurality of perturbed input images to the CNN; and
outputting an explanation map of the output determination of the CNN based on the plurality of visualization maps,
wherein outputting the explanation map comprises performing a fusion process to combine feature information from the plurality of visualization maps, and
wherein the fusion process to combine feature information of visualization maps of the plurality of visualization maps comprises:
normalizing a first visualization map of the plurality of visualization maps;
performing unweighted addition of the normalized first visualization map and a normalized second visualization map to obtain a first result;
performing Otsu-based binarization on the normalized second visualization map to eliminate features which are not present in the normalized first visualization map to obtain a second result;
performing point-wise multiplication on the first result and the second result to obtain a third result; and
performing the fusion process using the third result and a next visualization map of the plurality of visualization maps.
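The fusion cascade recited in the claim can be sketched in NumPy. This is an illustrative reconstruction only, not the patented implementation: the CNN, pooling-layer feature extraction, attribution masking, and confidence scoring are omitted, and the helper names (`fuse_pair`, `fuse_all`, `otsu_threshold`) are hypothetical. Each fusion step normalizes two visualization maps, adds them without weighting, binarizes the second map with Otsu's method, and multiplies the two results point-wise; the output is then fused with the next map in the cascade.

```python
import numpy as np

def normalize(vmap):
    # Min-max normalize a visualization map to [0, 1].
    vmin, vmax = float(vmap.min()), float(vmap.max())
    if vmax == vmin:
        return np.zeros_like(vmap, dtype=float)
    return (vmap - vmin) / (vmax - vmin)

def otsu_threshold(vmap, bins=256):
    # Otsu's method: pick the threshold that maximizes the
    # between-class variance of the map's intensity histogram.
    hist, edges = np.histogram(vmap, bins=bins, range=(0.0, 1.0))
    prob = hist.astype(float) / hist.sum()
    omega = np.cumsum(prob)                   # class-0 probability
    mu = np.cumsum(prob * np.arange(bins))    # class-0 cumulative mean
    mu_t = mu[-1]                             # global mean
    denom = omega * (1.0 - omega)
    denom[denom == 0] = np.nan                # skip degenerate splits
    sigma_b = (mu_t * omega - mu) ** 2 / denom
    if np.all(np.isnan(sigma_b)):
        return 0.5                            # constant map: arbitrary split
    k = int(np.nanargmax(sigma_b))
    return edges[k + 1]

def fuse_pair(vmap_a, vmap_b):
    # One fusion step as recited in the claim:
    # 1) normalize both visualization maps,
    # 2) unweighted addition            -> first result,
    # 3) Otsu-based binarization of the
    #    normalized second map          -> second result,
    # 4) point-wise multiplication      -> third result.
    a = normalize(vmap_a)
    b = normalize(vmap_b)
    added = a + b
    binary = (b >= otsu_threshold(b)).astype(float)
    return added * binary

def fuse_all(vis_maps):
    # Cascade the pairwise fusion: each third result is fused
    # with the next visualization map in the plurality.
    fused = vis_maps[0]
    for nxt in vis_maps[1:]:
        fused = fuse_pair(fused, nxt)
    return fused
```

The point-wise multiplication by the binary mask is what suppresses activations in the running sum that the second (typically deeper, more class-discriminative) map does not support, which matches the claim's stated purpose of eliminating unsupported features before the next cascade step.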