US 12,254,390 B2
Wasserstein barycenter model ensembling
Youssef Mroueh, Yorktown Heights, NY (US); Pierre L. Dognin, Yorktown Heights, NY (US); Igor Melnyk, Yorktown Heights, NY (US); Jarret Ross, Yorktown Heights, NY (US); Tom Sercu, Yorktown Heights, NY (US); and Cicero Nogueira Dos Santos, Yorktown Heights, NY (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by International Business Machines Corporation, Armonk, NY (US)
Filed on Apr. 29, 2019, as Appl. No. 16/397,008.
Prior Publication US 2020/0342361 A1, Oct. 29, 2020
Int. Cl. G06N 20/20 (2019.01); G06F 16/25 (2019.01); G06F 18/22 (2023.01); G06F 18/24 (2023.01); G06V 10/764 (2022.01); G06V 10/80 (2022.01)
CPC G06N 20/20 (2019.01) [G06F 16/25 (2019.01); G06F 18/22 (2023.01); G06F 18/24 (2023.01); G06V 10/764 (2022.01); G06V 10/80 (2022.01)]
20 Claims
 
1. A method of ensembling, comprising:
inputting a set of models that predict different sets of attributes;
determining a source set of attributes and a target set of attributes using a Wasserstein barycenter with an optimal transport metric that includes a Wasserstein distance and is based on a cost matrix defining pairwise distances between semantic classes;
inputting side information into the Wasserstein barycenter, wherein the side information includes class relationships represented by an embedding space;
determining a consensus among the set of models whose predictions are defined on the source set of attributes; and
training a neural network with the consensus.
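
For illustration only, the barycenter step recited in claim 1 might be sketched as follows. This is a minimal sketch, not the patented implementation: it assumes an entropic-regularized Wasserstein barycenter computed by iterative Bregman projections, a squared-Euclidean cost between class embeddings standing in for the side information, and toy one-dimensional embeddings. The function names (`pairwise_sq_dist`, `wasserstein_barycenter`), the parameters `eps` and `n_iter`, and all numeric values are hypothetical.

```python
import numpy as np


def pairwise_sq_dist(src, tgt):
    """Cost matrix of squared Euclidean distances between class embeddings.

    src: (N, d) embeddings of a model's source classes
    tgt: (M, d) embeddings of the target classes
    """
    diff = src[:, None, :] - tgt[None, :, :]
    return (diff ** 2).sum(axis=-1)


def wasserstein_barycenter(preds, costs, weights, eps=0.5, n_iter=200):
    """Entropic-regularized Wasserstein barycenter (iterative Bregman projections).

    preds:   list of 1-D arrays; model k's probabilities over its N_k source classes
    costs:   list of (N_k, M) cost matrices from source classes to target classes
    weights: list of non-negative floats, one per model, summing to 1
    Returns a probability vector over the M target classes (the consensus).
    """
    kernels = [np.exp(-c / eps) for c in costs]   # Gibbs kernels
    m = costs[0].shape[1]
    vs = [np.ones(m) for _ in preds]              # target-side scalings
    b = np.full(m, 1.0 / m)
    for _ in range(n_iter):
        log_b = np.zeros(m)
        projected = []
        for p, k, v, w in zip(preds, kernels, vs, weights):
            u = p / (k @ v)                       # match the source marginal p
            kt_u = k.T @ u                        # mass pushed onto target classes
            projected.append(kt_u)
            log_b += w * np.log(kt_u)             # weighted geometric mean, in logs
        b = np.exp(log_b)
        vs = [b / kt_u for kt_u in projected]     # match the shared target marginal b
    return b / b.sum()


# Toy side information (assumed): classes embedded on a line.
emb_a = np.array([[0.0], [1.0]])                  # classes known to model A
emb_b = np.array([[3.0], [4.0]])                  # classes known to model B
emb_target = np.arange(5.0).reshape(-1, 1)        # target classes at 0..4

preds = [np.array([0.9, 0.1]), np.array([0.1, 0.9])]
costs = [pairwise_sq_dist(emb_a, emb_target),
         pairwise_sq_dist(emb_b, emb_target)]

consensus = wasserstein_barycenter(preds, costs, weights=[0.5, 0.5])
print(consensus)
```

In this toy setup the two models predict over disjoint label sets, yet the cost matrices place both on a common target set, so the resulting consensus is a single distribution over the target classes. Such a vector could then serve as a soft label when training a neural network, as the final step of the claim recites.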