US 12,406,210 B2
Techniques for machine learning model selection for domain generalization
Devansh Arpit, Pacifica, CA (US); Huan Wang, Palo Alto, CA (US); Yingbo Zhou, Palo Alto, CA (US); and Caiming Xiong, Palo Alto, CA (US)
Assigned to Salesforce, Inc., San Francisco, CA (US)
Filed by Salesforce, Inc., San Francisco, CA (US)
Filed on May 16, 2022, as Appl. No. 17/663,595.
Prior Publication US 2023/0368078 A1, Nov. 16, 2023
Int. Cl. G06N 20/20 (2019.01); G06F 18/20 (2023.01); G06F 18/21 (2023.01)

CPC G06N 20/20 (2019.01) [G06F 18/217 (2023.01); G06F 18/285 (2023.01)]

20 Claims

1. A method for machine learning model training, comprising:

performing training of a plurality of machine learning models on a first data set associated with a first domain, wherein the plurality of machine learning models comprises respective sets of parameters that are updated across a plurality of iterations during the training, wherein the training comprises, for each machine learning model of the plurality of machine learning models, inputting, as values for a set of parameters of the respective sets of parameters and for an iteration of the plurality of iterations, a moving average of the set of parameters calculated over a threshold number of previous iterations;

selecting a plurality of model states that are generated during the training of the plurality of machine learning models, wherein the plurality of model states are selected based at least in part on a validation performance of the plurality of model states performed during the training;

generating an ensembled machine learning model by aggregating the plurality of machine learning models corresponding to the plurality of selected model states; and

performing a machine learning prediction using the ensembled machine learning model on a second data set associated with a second domain different from the first domain, wherein an output of the ensembled machine learning model is a dimension-wise average of respective outputs from the plurality of machine learning models in the ensembled machine learning model.
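
The steps recited in claim 1 can be illustrated with a minimal sketch. It is one possible reading of the claim, not the patented implementation. Parameters and outputs are plain lists of floats. The helper names (mean_vectors, train_one_model, ensemble_predict) and the callables step, validate, and predict are hypothetical and do not appear in the patent. The sketch assumes that the moving average over the last `window` iterations is the state that gets validated, and that one best state is kept per model.

```python
from collections import deque
from typing import Callable, List, Optional, Sequence

Vector = List[float]


def mean_vectors(vectors: Sequence[Vector]) -> Vector:
    """Dimension-wise average of equally sized vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def train_one_model(
    start: Vector,
    step: Callable[[Vector], Vector],      # hypothetical: one training update
    validate: Callable[[Vector], float],   # hypothetical: higher is better
    iterations: int,
    window: int,                           # the "threshold number" of iterations
) -> Optional[Vector]:
    """Toy training loop that returns the averaged state with the best validation score."""
    recent = deque(maxlen=window)  # holds at most the last `window` parameter vectors
    params = list(start)
    best_state, best_score = None, float("-inf")
    for _ in range(iterations):
        params = step(params)
        recent.append(list(params))
        averaged = mean_vectors(recent)    # moving average of the parameters
        score = validate(averaged)         # validation performed during training
        if score > best_score:
            best_state, best_score = averaged, score
    return best_state


def ensemble_predict(
    states: Sequence[Vector],
    predict: Callable[[Vector, float], Vector],  # hypothetical forward pass
    x: float,
) -> Vector:
    """Ensemble output: dimension-wise average of the member outputs."""
    return mean_vectors([predict(state, x) for state in states])
```

In this sketch, train_one_model would be called once per model on first-domain data, and the returned states passed together to ensemble_predict for inputs from the second domain.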