US 12,443,844 B2
Neural network trained using ordinal loss function
Robert Vanderheyden, Acworth, GA (US); William Chamberlin, Palatine, IL (US); and John Handy Bosma, Leander, TX (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed on Oct. 25, 2021, as Appl. No. 17/509,840.
Prior Publication US 2023/0132127 A1, Apr. 27, 2023
Int. Cl. G06N 3/08 (2023.01); G06F 18/2113 (2023.01); G06F 18/2137 (2023.01); G06F 18/22 (2023.01); G06F 18/2431 (2023.01)
CPC G06N 3/08 (2013.01) [G06F 18/2113 (2023.01); G06F 18/2137 (2023.01); G06F 18/22 (2023.01); G06F 18/2431 (2023.01)]
20 Claims
 
1. A computer-implemented process for training an ordinal mapping deep neural network, the computer-implemented process comprising:
receiving a plurality of samples, wherein each sample is a computer-processable data structure corresponding to a real-world object and includes a data element indicating a class of each sample, wherein the class is one of n predefined classes to which each sample is linked;
feeding each sample into an ordinal mapping deep neural network that maps each sample to a sample point of a multidimensional space,
wherein a sample, of the plurality of samples, comprises factual statements identified within text, and
wherein the ordinal mapping deep neural network ranks the factual statements identified within the text;
predicting the class based on an ordinal mapping of each sample point by the ordinal mapping deep neural network; and
iteratively adjusting parameters of the ordinal mapping deep neural network in response to misclassifying one or more samples, of the plurality of samples, by the ordinal mapping deep neural network, wherein the iteratively adjusting is based on an expected ordinal mapping loss determined by an ordinal mapping loss function that measures (a) distances between a hyperplane extending through each sample point in the multidimensional space and each other sample point of a same class and (b) overlap between sample points of different classes.
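
The two quantities named in the final clause of claim 1 can be illustrated with a minimal sketch. It is one possible reading and not the patented implementation: the claim does not fix the orientation of the hyperplanes, the formula for overlap, or how the two terms are combined. The sketch assumes that every hyperplane is orthogonal to a single hypothetical ordinal axis `direction`, that classes are integers whose numeric order is the ordinal order, and that the two terms are simply summed. The names `project` and `ordinal_mapping_loss` are illustrative only.

```python
import math
from itertools import combinations


def project(point, direction):
    """Signed position of `point` along the unit vector parallel to `direction`."""
    norm = math.sqrt(sum(d * d for d in direction))
    return sum(p * d for p, d in zip(point, direction)) / norm


def ordinal_mapping_loss(points, classes, direction):
    """Illustrative loss: (a) same-class distance term plus (b) cross-class overlap term.

    Assumption: each hyperplane is orthogonal to `direction`, so the distance
    from the hyperplane through point i to point j is |proj(j) - proj(i)|.
    """
    proj = [project(p, direction) for p in points]
    spread = 0.0   # (a) distances between same-class points and each other's hyperplane
    overlap = 0.0  # (b) how far a lower-class point sits beyond a higher-class point
    for i, j in combinations(range(len(points)), 2):
        if classes[i] == classes[j]:
            spread += abs(proj[i] - proj[j])
        else:
            lo, hi = (i, j) if classes[i] < classes[j] else (j, i)
            overlap += max(0.0, proj[lo] - proj[hi])
    return spread + overlap


# Hypothetical sample points in a 2-D space, with the ordinal axis along x.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 5.0)]
axis = (1.0, 0.0)

well_ordered = ordinal_mapping_loss(points, [0, 0, 1, 1], axis)  # 0.0
mis_ordered = ordinal_mapping_loss(points, [1, 1, 0, 0], axis)   # 4.0
```

Under these assumptions the loss is zero when same-class points share a hyperplane and the classes appear in order along the axis, and it grows as classes overlap. A training loop in the sense of the claim would iteratively adjust the network parameters to reduce such a value.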