CPC G06N 3/08 (2013.01) [G06F 18/10 (2023.01); G06F 18/2148 (2023.01); G06F 18/23 (2023.01); G06F 18/24137 (2023.01)] | 20 Claims
1. A computer-implemented training method for training a classifier (Φη), wherein:
a transformed sample being a sample obtained by applying a transformation (T) to a source sample, where the source sample is a datum of a source dataset (SD);
the training method comprises:
S10) training a pretext model (ΦΘ) to learn a pretext task, based on a source dataset (SD), by using a first training criterion which tends to minimize, across the source samples of the source dataset, a distance between an output of a source sample via the pretext model (ΦΘ) and an output of a corresponding transformed sample via the pretext model (ΦΘ);
S20) for at least one sample among the samples (Xi) of the source dataset (SD), determining a neighborhood (NXi) of the at least one sample;
wherein for the at least one sample, the neighborhood (NXi) of the at least one sample comprises K closest neighbors of the sample, K being an integer, K≥1, the K closest neighbors of the sample being K samples Xj of the dataset having smallest distances between ΦΘ(Xi) and ΦΘ(Xj);
S30) training the classifier Φη to predict respective estimated probabilities Φηj(Xi), j=1 . . . C, for a sample to belong to respective clusters (Cj), by using a second training criterion which:
tends to maximize a likelihood for a sample and a neighbor (Xj) of the sample belonging to the neighborhood (NXi) of the sample to belong to the same cluster; and
tends to force the samples to be distributed over a plurality of clusters;
the second training criterion includes a summation:
[formula image not reproduced] where
f is an increasing continuous function, for instance a logarithm;
<, > is a dot product;
D is a dataset used for training the classifier at step S30; and
|D| is the number of samples in the dataset.
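
Illustrative note (not part of the claim): the formula of the summation is not legible in this copy. One form that is consistent with the symbols defined in the claim (f, the dot product, D, |D| and the neighborhood of a sample) is sketched below. The leading minus sign, the 1/|D| normalization and the index names X and k are assumptions made for illustration; they are not confirmed by the text above.

```latex
% Hedged reconstruction: the sign, the normalization and the index names are assumed.
-\frac{1}{|D|} \sum_{X \in D} \; \sum_{k \in N_X}
    f\bigl( \langle \Phi_\eta(X), \, \Phi_\eta(k) \rangle \bigr)
```

With f increasing, minimizing this expression increases the dot product between the predicted probability vectors of a sample and of its neighbors, which matches the stated aim of assigning a sample and its neighbors to the same cluster.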
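
Illustrative note (not part of the claim): steps S10 to S30 can be sketched numerically as follows. This is a minimal NumPy sketch under stated assumptions, not the claimed implementation. The function names are hypothetical, the Euclidean distance is assumed because the claim does not name a distance, a sample is assumed not to be its own neighbor, f is taken to be the logarithm (the example the claim gives), and the negative-entropy term is only one possible way to spread samples over several clusters.

```python
import numpy as np


def pretext_distance(out_source, out_transformed):
    """S10 (sketch): mean Euclidean distance between pretext outputs of
    source samples and of their transformed counterparts (rows are paired)."""
    return float(np.linalg.norm(out_source - out_transformed, axis=1).mean())


def k_nearest_neighbors(embeddings, k):
    """S20 (sketch): for each sample, indices of the k samples whose
    pretext embeddings are closest (Euclidean distance is an assumption)."""
    diff = embeddings[:, None, :] - embeddings[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # assumption: exclude the sample itself
    return np.argsort(dist, axis=1)[:, :k]


def consistency_term(probs, neighbors, f=np.log, eps=1e-12):
    """S30 (sketch): -(1/|D|) * sum over samples and their neighbors of
    f(<probs[i], probs[j]>). The sign and normalization are assumptions."""
    # probs: (N, C) cluster probabilities; neighbors: (N, K) index array
    dots = np.einsum("ic,ikc->ik", probs, probs[neighbors])
    return float(-f(dots + eps).sum() / probs.shape[0])


def spread_term(probs, eps=1e-12):
    """S30 (sketch): negative entropy of the mean cluster assignment.
    Lower values mean samples are spread over more clusters (assumed form)."""
    mean = probs.mean(axis=0)
    return float((mean * np.log(mean + eps)).sum())


if __name__ == "__main__":
    # Toy data: two well-separated pairs of embeddings, two clusters.
    emb = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
    probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]])
    nbrs = k_nearest_neighbors(emb, k=1)
    print(consistency_term(probs, nbrs), spread_term(probs))
```

In this sketch a second training criterion could be formed as `consistency_term(...) + lam * spread_term(...)`, where the weight `lam` is a hypothetical hyperparameter that does not appear in the claim.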