CPC G06F 9/5005 (2013.01) [G06F 9/3009 (2013.01); G06F 9/5011 (2013.01); G06F 9/505 (2013.01); G06F 11/3409 (2013.01); G06F 18/214 (2023.01); G06F 18/24155 (2023.01); G06N 20/00 (2019.01)] | 21 Claims |
1. An orchestration platform, comprising:
a processor; and
a memory unit operatively connected to the processor including computer code that when executed causes the processor to:
apply a Bayesian optimization technique to a plurality of computing resource configurations used to perform machine learning model training jobs in training a machine learning model, wherein output of the Bayesian optimization technique determines a computing resource configuration of the plurality of computing resource configurations;
execute a subset of the machine learning training jobs using the computing resource configuration of the plurality of computing resource configurations;
repeatedly select additional subsets of the plurality of computing resource configurations and execute the additional subsets of the machine learning training jobs using the additional subsets of the computing resource configurations until a stopping criterion is met for a model hyperparameter associated with the machine learning model; and
select one of the plurality of computing resource configurations from the additional subsets of the plurality of computing resource configurations, wherein the selection of the one of the plurality of computing resource configurations corresponds with a value of the model hyperparameter that is within a desired time within which the additional subsets of the computing resource configurations is completed.
|
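The select, execute, and repeat loop recited in claim 1 can be illustrated with a minimal sketch. The claim does not specify a particular Bayesian optimization method, so the sketch below substitutes Thompson sampling with an independent Gaussian posterior per configuration, which is a simple Bayesian sequential-selection technique and not necessarily the claimed one. The names `ResourceConfig`, `Posterior`, `simulate_job`, and `search`, the prior values, the made-up duration formula, and the reading of the stopping criterion as a target job duration are all assumptions for illustration. Each round runs a single job as a stand-in for a subset of training jobs.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceConfig:
    """Hypothetical computing resource configuration (fields are assumptions)."""
    cpus: int
    gpus: int
    memory_gb: int


@dataclass
class Posterior:
    """Gaussian belief about a configuration's mean job duration, in seconds."""
    mean: float = 60.0   # assumed prior mean
    var: float = 900.0   # assumed prior variance, deliberately wide
    n: int = 0           # number of jobs observed so far

    def update(self, duration: float, noise_var: float) -> None:
        # Conjugate normal update with known observation noise.
        precision = 1.0 / self.var + 1.0 / noise_var
        self.mean = (self.mean / self.var + duration / noise_var) / precision
        self.var = 1.0 / precision
        self.n += 1


def simulate_job(config: ResourceConfig, rng: random.Random) -> float:
    """Stand-in for running one training job; returns a made-up duration."""
    base = 120.0 / (config.cpus + 4 * config.gpus)
    return max(0.1, base + rng.gauss(0.0, 2.0))


def search(configs, run_job, target_seconds, max_rounds=200,
           noise_var=4.0, min_obs=3, seed=0):
    """Repeatedly pick a configuration, run a job on it, and update beliefs."""
    rng = random.Random(seed)
    posteriors = {c: Posterior() for c in configs}
    best = configs[0]
    rounds = 0
    for rounds in range(1, max_rounds + 1):
        # Thompson sampling: draw one plausible duration per configuration
        # and run the next job on the configuration with the lowest draw.
        chosen = min(
            configs,
            key=lambda c: rng.gauss(posteriors[c].mean,
                                    math.sqrt(posteriors[c].var)),
        )
        posteriors[chosen].update(run_job(chosen, rng), noise_var)

        # Assumed stopping criterion: the best configuration tried so far
        # has enough observations and an estimated duration within target.
        tried = [c for c in configs if posteriors[c].n > 0]
        best = min(tried, key=lambda c: posteriors[c].mean)
        if (posteriors[best].n >= min_obs
                and posteriors[best].mean <= target_seconds):
            break
    return best, posteriors[best].mean, rounds


configs = [
    ResourceConfig(cpus=2, gpus=0, memory_gb=8),
    ResourceConfig(cpus=4, gpus=0, memory_gb=16),
    ResourceConfig(cpus=4, gpus=1, memory_gb=32),
    ResourceConfig(cpus=8, gpus=2, memory_gb=64),
]
best, estimate, rounds = search(configs, simulate_job, target_seconds=10.0)
print(best, round(estimate, 1), rounds)
```

In a real deployment, `simulate_job` would be replaced by a call that launches training jobs on the chosen configuration and reports a measured quantity. A Gaussian-process surrogate with an acquisition function would be a closer match to what is usually called Bayesian optimization.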