CPC G06N 3/08 (2013.01) [G06N 5/04 (2013.01)] | 20 Claims |
1. A computer-implemented method, comprising:
providing an input image to a convolutional neural network (CNN), wherein the input image is associated with an image-level label;
obtaining, from the CNN, a first classification probability distribution associated with the input image;
determining, based at least in part on the first classification probability distribution and the image-level label of the input image, a classification loss;
determining an attention map associated with the input image;
applying a thresholding operation to the attention map to obtain a soft mask;
applying the soft mask to the input image to obtain a masked image;
providing the masked image as an input to the CNN;
obtaining, from the CNN, a second classification probability distribution associated with the masked image;
determining an attention mining loss associated with the attention map based at least in part on the first classification probability distribution and the second classification probability distribution; and
utilizing the classification loss and the attention mining loss to self-guide training of the CNN.
|
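The steps recited in claim 1 can be illustrated with a short sketch. The claim does not fix any functional form, so everything specific below is an assumption made for illustration only: the sigmoid used as the thresholding operation (with hypothetical parameters `sigma` and `omega`), the masking rule that suppresses attended pixels, the ratio used as the attention mining loss, the weight `alpha`, and `toy_classifier`, which stands in for the CNN. The attention map is taken as given rather than derived from the network.

```python
import math


def softmax(logits):
    """Turn raw scores into a classification probability distribution."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_mask(attention, sigma=0.5, omega=10.0):
    """Assumed thresholding operation: a sigmoid centred at sigma.

    Attention values well above sigma map close to 1, values well below
    map close to 0, and the transition between them is smooth.
    """
    return [[1.0 / (1.0 + math.exp(-omega * (a - sigma))) for a in row]
            for row in attention]


def apply_mask(image, mask):
    """Assumed masking rule: suppress the pixels the attention map highlights."""
    return [[px * (1.0 - m) for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]


def classification_loss(probs, label):
    """Cross-entropy between the first distribution and the image-level label."""
    return -math.log(probs[label] + 1e-12)


def attention_mining_loss(p_first, p_masked, label):
    """Hypothetical form: share of labelled-class confidence left after masking."""
    return p_masked[label] / (p_first[label] + 1e-12)


def self_guidance_loss(classify, image, attention, label, alpha=1.0):
    """Run both passes and combine the two losses (alpha is an assumed weight)."""
    p_first = classify(image)                        # first distribution
    l_cl = classification_loss(p_first, label)
    masked = apply_mask(image, soft_mask(attention))
    p_masked = classify(masked)                      # second distribution
    l_am = attention_mining_loss(p_first, p_masked, label)
    return l_cl + alpha * l_am, l_cl, l_am


def toy_classifier(image):
    """Stand-in for the CNN: class 0 scores the left half, class 1 the right."""
    half = len(image[0]) // 2
    left = sum(sum(row[:half]) for row in image)
    right = sum(sum(row[half:]) for row in image)
    return softmax([left, right])
```

Under these assumptions, an attention map that covers the region driving the labelled class yields a lower attention mining loss than one that misses it, because masking that region removes most of the class evidence from the second pass. In an actual training loop the combined loss would be minimized by gradient descent over the CNN parameters, which this sketch does not attempt.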