US 12,333,438 B1
Resource-efficient techniques for repeated hyper-parameter optimization
Giovanni Zappella, Berlin (DE); Cedric Philippe Archambeau, Berlin (DE); and David Salinas, Meylan (FR)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Jun. 30, 2021, as Appl. No. 17/364,775.
Int. Cl. G06N 3/082 (2023.01); G06F 18/214 (2023.01); G06F 18/23 (2023.01); G06N 3/045 (2023.01)
CPC G06N 3/082 (2013.01) [G06F 18/214 (2023.01); G06F 18/23 (2023.01); G06N 3/045 (2023.01)]
20 Claims
 
1. A system, comprising:
one or more computing devices;
wherein the one or more computing devices include instructions that upon execution on or across the one or more computing devices cause the one or more computing devices to:
obtain an indication that respective sets of hyper-parameter combination (HPC) analysis experiments are to be conducted for a plurality of related machine learning tasks, wherein hyper-parameter search spaces of individual ones of the related machine learning tasks overlap at least partly with each other;
select, using a first set of HPC analysis experiments, a first recommended HPC for a first machine learning task of the plurality of related machine learning tasks;
include the first recommended HPC in a collection of candidate HPCs to be analyzed for a second machine learning task of the plurality of related machine learning tasks;
conduct, using the collection of candidate HPCs, a second set of HPC analysis experiments for the second machine learning task, wherein the second set comprises a plurality of analysis iterations, and wherein a particular analysis iteration of the plurality of analysis iterations comprises:
performing the second machine learning task using a first iteration-specific set of HPCs, wherein the first iteration-specific set includes (a) the first recommended HPC and (b) one or more other members of the collection;
assigning respective rankings to individual members of the first iteration-specific set of HPCs based at least in part on respective loss function values obtained as a result of the performing;
classifying one or more HPCs from the first iteration-specific set as suitable-for-future-iterations, based at least in part on a comparison of (a) respective rankings assigned to the one or more HPCs and (b) a ranking assigned to the first recommended HPC; and
generating, from the first iteration-specific set, a second iteration-specific set of HPCs for a subsequent analysis iteration of the plurality of analysis iterations, wherein said generating includes pruning, from the first iteration-specific set, one or more hyper-parameter combinations which are not classified as suitable-for-future-iterations; and
select a second recommended HPC for the second machine learning task based at least in part on loss function values computed in the plurality of analysis iterations; and
store an indication of (a) the second recommended HPC and (b) a result of execution of the second machine learning task using the second recommended HPC.
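
The analysis iterations recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names `rank_by_loss` and `prune_against_reference`, the `evaluate(hpc, iteration)` callback, and the fixed iteration count are all hypothetical. The sketch further assumes that a candidate is "suitable-for-future-iterations" when its rank is at least as good as the rank of the HPC recommended for the first task, and that the second recommended HPC is the survivor with the lowest loss in the final iteration. The claim only requires that the classification and the selection be based "at least in part" on those rankings and loss values, so other rules would also fit.

```python
from typing import Callable, Dict, Hashable, List, Sequence, Tuple


def rank_by_loss(losses: Dict[Hashable, float]) -> Dict[Hashable, int]:
    """Assign rank 0 to the lowest loss, rank 1 to the next lowest, and so on."""
    ordered = sorted(losses, key=lambda hpc: losses[hpc])
    return {hpc: rank for rank, hpc in enumerate(ordered)}


def prune_against_reference(
    candidates: Sequence[Hashable],
    reference_hpc: Hashable,
    evaluate: Callable[[Hashable, int], float],
    num_iterations: int,
) -> Tuple[Hashable, float, List[Hashable]]:
    """Run analysis iterations for the second task (illustrative only).

    reference_hpc is the HPC recommended for the first, related task.
    evaluate(hpc, iteration) is a hypothetical callback that performs the
    second task with the given HPC and returns its loss for that iteration.
    Returns (recommended HPC, its final-iteration loss, surviving HPCs).
    """
    if num_iterations < 1:
        raise ValueError("num_iterations must be at least 1")

    # First iteration-specific set: the reference HPC plus the other
    # candidates, de-duplicated while preserving order.
    active = list(dict.fromkeys([reference_hpc, *candidates]))
    losses: Dict[Hashable, float] = {}

    for iteration in range(num_iterations):
        # Perform the task with every HPC in the current set.
        losses = {hpc: evaluate(hpc, iteration) for hpc in active}
        ranks = rank_by_loss(losses)

        # Assumed rule: keep an HPC only if it ranks at least as well as
        # the reference HPC. The reference itself always survives.
        reference_rank = ranks[reference_hpc]
        active = [hpc for hpc in active if ranks[hpc] <= reference_rank]

    # Assumed selection rule: lowest loss in the final iteration.
    best = min(active, key=lambda hpc: losses[hpc])
    return best, losses[best], active
```

Because the comparison is against the rank of the transferred HPC rather than a fixed fraction of the set, the number of HPCs pruned per iteration depends on how well the first task's recommendation carries over to the second task. Storing the returned HPC and its loss would correspond to the final "store an indication" step.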