CPC G06N 20/10 (2019.01) [G06F 17/142 (2013.01); G06F 17/18 (2013.01); G06F 18/2431 (2023.01); G06F 40/20 (2020.01); G06N 3/08 (2013.01); G06N 3/084 (2013.01); G06N 20/00 (2019.01); G06V 10/7715 (2022.01)] | 20 Claims |
1. A computer-implemented method of training a machine-learned model for classifying inputs into one or more classes of a plurality of classes, each of the plurality of classes having an associated class embedding in a plurality of class embeddings, the method comprising:
receiving, by one or more computing devices, one or more inputs and one or more labels;
determining, by the one or more computing devices, one or more input embeddings associated with the one or more inputs;
normalizing, by the one or more computing devices using a vector norm, the one or more input embeddings to obtain one or more normalized input embeddings;
normalizing, by the one or more computing devices using the vector norm, the plurality of class embeddings to obtain a plurality of normalized class embeddings;
selecting, by the one or more computing devices, one or more negative classes from the plurality of classes based at least in part on a probability distribution approximating a softmax distribution, wherein the probability distribution comprises a linearized kernel determined based at least in part on a Random Fourier Features map, wherein the linearized kernel provides a uniform multiplicative approximation of an exponential kernel associated with the softmax distribution, and wherein the probability distribution is a function of the one or more normalized input embeddings and the plurality of normalized class embeddings;
evaluating, by the one or more computing devices, a loss function to determine a loss based at least in part on the one or more negative classes, the one or more inputs, and the one or more labels; and
adjusting, by the one or more computing devices, one or more parameters of the machine-learned model based at least in part on the loss, the one or more inputs, and the one or more labels.
|
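The steps recited in claim 1 can be illustrated with a minimal NumPy sketch. It is not the patented implementation, and every function name, parameter name, and numeric value below is hypothetical. The sketch assumes the L2 norm as the vector norm, a Gaussian Random Fourier Features map with frequencies drawn from N(0, nu·I), clipping of non-positive kernel estimates to a small floor, and the standard sampled-softmax logit correction for the loss; the claim specifies none of these. For unit-norm vectors, exp(nu·hᵀc) = exp(nu)·exp(-nu·‖h - c‖²/2), so a feature map that linearizes the Gaussian kernel also linearizes the exponential (softmax) kernel up to a constant factor. For clarity the sketch scores every class explicitly, which forgoes the speed-up a linearized kernel is meant to enable.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Normalize vectors along the last axis (L2 norm is an assumption)."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def rff_map(x, w):
    """Random Fourier Features map: phi(x) = [cos(Wx), sin(Wx)] / sqrt(D)."""
    proj = x @ w.T
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1) / np.sqrt(w.shape[0])

def rff_sampling_distribution(h_n, c_n, w, floor=1e-8):
    """Distribution over classes proportional to phi(c_i) . phi(h)."""
    scores = rff_map(c_n, w) @ rff_map(h_n, w)
    scores = np.maximum(scores, floor)  # RFF estimates can be negative (assumed fix)
    return scores / scores.sum()

def sample_negatives(q, label, m, rng):
    """Draw m negative classes from q, excluding the true label."""
    q_neg = q.copy()
    q_neg[label] = 0.0
    q_neg /= q_neg.sum()
    negatives = rng.choice(q_neg.shape[0], size=m, replace=True, p=q_neg)
    return negatives, q_neg

def sampled_softmax_loss(h_n, c_n, label, negatives, q_neg, tau):
    """Sampled softmax loss with the usual log(m * q) logit correction."""
    logits = tau * (c_n @ h_n)
    m = negatives.shape[0]
    neg_logits = logits[negatives] - np.log(m * q_neg[negatives])
    all_logits = np.concatenate([[logits[label]], neg_logits])
    mx = all_logits.max()
    log_sum_exp = mx + np.log(np.exp(all_logits - mx).sum())
    return log_sum_exp - logits[label]

# Hypothetical sizes: 50 classes, 16-dim embeddings, 2048 random frequencies.
rng = np.random.default_rng(0)
n_classes, dim, n_freq, nu = 50, 16, 2048, 1.0
w = rng.normal(scale=np.sqrt(nu), size=(n_freq, dim))  # w ~ N(0, nu * I)

h_n = l2_normalize(rng.normal(size=dim))                # normalized input embedding
c_n = l2_normalize(rng.normal(size=(n_classes, dim)))   # normalized class embeddings
label = 3

q = rff_sampling_distribution(h_n, c_n, w)
negatives, q_neg = sample_negatives(q, label, m=5, rng=rng)
loss = sampled_softmax_loss(h_n, c_n, label, negatives, q_neg, tau=nu)
```

In a training loop, the gradient of `loss` with respect to the model parameters would drive the adjusting step; that part is omitted here because it depends on the model and the autodiff framework in use.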