US 11,941,086 B2
Systems and methods for contrastive attention-supervised tuning
Ramprasaath Ramasamy Selvaraju, Atlanta, GA (US); and Nikhil Naik, Mountain View, CA (US)
Assigned to Salesforce, Inc., San Francisco, CA (US)
Filed by Salesforce, Inc., San Francisco, CA (US)
Filed on Mar. 22, 2021, as Appl. No. 17/209,011.
Claims priority of provisional application 63/114,484, filed on Nov. 16, 2020.
Prior Publication US 2022/0156527 A1, May 19, 2022
Int. Cl. G06F 18/21 (2023.01); G06F 17/16 (2006.01); G06F 18/214 (2023.01); G06N 3/045 (2023.01); G06N 3/084 (2023.01); G06N 3/10 (2006.01); G06T 7/194 (2017.01); G06V 10/25 (2022.01); G06V 10/46 (2022.01)
CPC G06F 18/2193 (2023.01) [G06F 17/16 (2013.01); G06F 18/214 (2023.01); G06N 3/045 (2023.01); G06N 3/084 (2013.01); G06N 3/10 (2013.01); G06T 7/194 (2017.01); G06V 10/25 (2022.01); G06V 10/462 (2022.01); G06T 2207/20084 (2013.01)]
20 Claims
 
1. A method for saliency-constrained random cropping of a training image, the method comprising:
receiving a training image sample for training a contrastive learning network;
generating, from the training image sample, a saliency map in a form of a binary mask indicating a plurality of salient regions in the training image sample;
generating a first random crop and a second random crop of the training image sample, both subject to a crop constraint that each of the first random crop and the second random crop overlaps with one or more salient regions in the saliency map for more than an area-overlap threshold;
sending the first random crop as a key and the second random crop as a query to the contrastive learning network;
computing a contrastive loss from the first random crop corresponding to the key and the second random crop corresponding to the query from the contrastive learning network; and
updating the contrastive learning network based at least in part on the contrastive loss.
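The crop constraint recited in claim 1 can be illustrated with a minimal rejection-sampling sketch. It is not taken from the patent. The function names, the scale and aspect-ratio ranges, the retry limit, and the full-image fallback are all hypothetical. It also assumes that "overlap" means the fraction of the total salient area that falls inside the crop, which is only one possible reading of the claim language.

```python
import numpy as np


def salient_overlap(mask, box):
    """Fraction of the mask's salient pixels that fall inside the crop box.

    Assumption: overlap is measured relative to the total salient area.
    """
    top, left, h, w = box
    total = mask.sum()
    if total == 0:
        return 0.0  # no salient region at all
    return float(mask[top:top + h, left:left + w].sum() / total)


def sample_constrained_crop(mask, rng, threshold=0.2,
                            scale=(0.2, 1.0), max_tries=100):
    """Draw random crops until one exceeds the area-overlap threshold.

    Returns (top, left, height, width). The scale range, aspect-ratio
    range, and max_tries are illustrative values, not from the source.
    """
    H, W = mask.shape
    for _ in range(max_tries):
        area = rng.uniform(*scale) * H * W
        ratio = rng.uniform(3 / 4, 4 / 3)          # width / height
        h = int(round(np.sqrt(area / ratio)))
        w = int(round(np.sqrt(area * ratio)))
        if not (0 < h <= H and 0 < w <= W):
            continue                               # crop does not fit
        top = int(rng.integers(0, H - h + 1))
        left = int(rng.integers(0, W - w + 1))
        box = (top, left, h, w)
        if salient_overlap(mask, box) > threshold:
            return box
    # Hypothetical fallback: use the whole image, which covers every
    # salient pixel whenever the mask is non-empty.
    return (0, 0, H, W)


def sample_crop_pair(mask, rng, threshold=0.2):
    """Sample the key crop and the query crop under the same constraint."""
    key_box = sample_constrained_crop(mask, rng, threshold)
    query_box = sample_constrained_crop(mask, rng, threshold)
    return key_box, query_box


# Toy binary saliency mask with one salient square.
mask = np.zeros((64, 64), dtype=np.uint8)
mask[20:40, 20:40] = 1
rng = np.random.default_rng(0)
key_box, query_box = sample_crop_pair(mask, rng, threshold=0.2)
```

In a training loop, the two boxes would be used to slice the image, the resulting crops would be augmented and passed to the network as key and query, and the contrastive loss would be computed on their embeddings. Those steps depend on the network and are left out of this sketch.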