US 11,734,574 B1
Neural Bregman divergences for distance learning
Fred Sun Lu, Camas, WA (US); and Edward Simon Paster Raff, Jamesville, NY (US)
Assigned to BOOZ ALLEN HAMILTON INC., McLean, VA (US)
Filed by Booz Allen Hamilton Inc., McLean, VA (US)
Filed on Mar. 8, 2022, as Appl. No. 17/689,185.
Int. Cl. G06N 3/084 (2023.01); G06N 3/048 (2023.01)

CPC G06N 3/084 (2013.01) [G06N 3/048 (2023.01)]

12 Claims

1. A method for configuring a computer for data similarity determination using Bregman divergence, the method comprising:

storing a data set, the data set having plural data pairs with one or more data points corresponding to one or more features, wherein a first given feature of a first piece of data in a first data pair has a known target distance to a second given feature of a second piece of data in the first data pair; and

training an input convex neural network (ICNN) using the data set, the ICNN having one or more parameters, wherein training the ICNN includes:

for each data pair within the data set:

extracting one or more features for each piece of data in the first data pair;

generating an empirical Bregman divergence for the data pair; and

computing one or more gradients between the one or more features within the first data pair based on the known target distance between the one or more features of the first data pair and the empirical Bregman divergence, the one or more gradients being computed using double backpropagation, automatic differentiation to compute the one or more gradients with respect to one or more data inputs, and a dot-product between the one or more gradients and another value;

generating a trained ICNN configured to output an arbitrary Bregman divergence function within a space of all possible Bregman divergences for a data pair based on the one or more gradients;

receiving a data file, the data file having one or more features;

inputting the data file into the trained ICNN;

generating a Bregman function for each of the one or more features of the data file, the one or more features including at least one of curvature and angularity;

calculating a distance between the one or more features of the data file and the one or more data points of the plural data pairs; and

outputting a classification of the data file based on the calculated distance.
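
For illustration only, the quantities recited in claim 1 can be sketched in a few lines of Python. A Bregman divergence induced by a convex function phi is D(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>, which combines a gradient with respect to the input and a dot-product. The sketch below is not the patented implementation: the class `TinyICNN`, the functions `bregman_divergence` and `classify`, and all weights and data points are hypothetical. It assumes a one-hidden-layer convex network with non-negative output weights as a minimal stand-in for an ICNN. The input gradient is written by hand where an automatic-differentiation framework would normally be used, and training by double backpropagation is omitted.

```python
import math


def softplus(z):
    # Numerically stable log(1 + exp(z)); convex and non-decreasing in z.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def sigmoid(z):
    # Derivative of softplus, evaluated in a numerically stable way.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class TinyICNN:
    """Hypothetical minimal convex network:
    phi(x) = sum_k a[k] * softplus(W[k] . x + b[k]), with every a[k] >= 0.
    A non-negative sum of convex functions of affine maps is convex in x."""

    def __init__(self, W, b, a):
        if any(ak < 0 for ak in a):
            raise ValueError("output weights must be non-negative for convexity")
        self.W, self.b, self.a = W, b, a

    def value(self, x):
        return sum(ak * softplus(dot(wk, x) + bk)
                   for wk, bk, ak in zip(self.W, self.b, self.a))

    def grad(self, x):
        # Gradient of phi with respect to the input x (hand-derived here;
        # an autodiff framework would compute this in practice).
        g = [0.0] * len(x)
        for wk, bk, ak in zip(self.W, self.b, self.a):
            s = ak * sigmoid(dot(wk, x) + bk)
            for i, wki in enumerate(wk):
                g[i] += s * wki
        return g


def bregman_divergence(phi, x, y):
    # D(x, y) = phi(x) - phi(y) - <grad phi(y), x - y>
    diff = [xi - yi for xi, yi in zip(x, y)]
    return phi.value(x) - phi.value(y) - dot(phi.grad(y), diff)


def classify(phi, query, labeled_points):
    # Assign the label of the stored point with the smallest divergence
    # from the query (a simple nearest-neighbour rule, assumed here).
    best = min(labeled_points,
               key=lambda item: bregman_divergence(phi, query, item[0]))
    return best[1]


# Illustrative weights and data only; none of these values come from the patent.
phi = TinyICNN(W=[[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]],
               b=[0.0, 0.0, 0.0],
               a=[1.0, 1.0, 0.5])
stored = [([0.0, 0.0], "a"), ([3.0, 3.0], "b")]
label = classify(phi, [0.2, 0.1], stored)
```

Because phi is convex, the divergence is non-negative and equals zero when both arguments coincide, but it is generally asymmetric: `bregman_divergence(phi, x, y)` and `bregman_divergence(phi, y, x)` may differ.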