US 12,462,189 B2
Robust and data-efficient blackbox optimization
Krzysztof Choromanski, Lincroft, NJ (US); Vikas Sindhwani, Mt. Kisco, NY (US); and Aldo Pacchiano Camacho, Berkeley, CA (US)
Appl. No. 17/423,601
Filed by Google LLC, Mountain View, CA (US)
PCT Filed Dec. 16, 2019, PCT No. PCT/US2019/066547
§ 371(c)(1), (2) Date Jul. 16, 2021,
PCT Pub. No. WO2020/149971, PCT Pub. Date Jul. 23, 2020.
Claims priority of provisional application 62/793,248, filed on Jan. 16, 2019.
Prior Publication US 2022/0108215 A1, Apr. 7, 2022
Int. Cl. G06N 20/00 (2019.01); G06F 17/11 (2006.01)
CPC G06N 20/00 (2019.01) [G06F 17/11 (2013.01)]
20 Claims
 
1. A computer-implemented method, comprising:
obtaining, by one or more computing devices, data descriptive of current values of a plurality of parameters of a machine-learned model; and
for at least one of one or more iterations:
identifying, by the one or more computing devices, one or more previously evaluated perturbations based at least in part on the current values of the plurality of parameters of the machine-learned model;
sampling, by the one or more computing devices, a plurality of perturbations to the current values of the plurality of parameters of the machine-learned model from a non-orthogonal sampling distribution, the plurality of perturbations including the one or more previously evaluated perturbations;
determining, by the one or more computing devices, a plurality of performance values respectively for the plurality of perturbations, wherein the performance value for each perturbation is generated through evaluation, by a performance evaluation function, of a performance of the machine-learned model with the current values of its parameters perturbed according to the perturbation, wherein determining the plurality of performance values comprises re-using one or more previously evaluated performance values respectively for the one or more previously evaluated perturbations;
performing, by the one or more computing devices, a regression with respect to the plurality of perturbations and the plurality of performance values to estimate a gradient of the performance evaluation function; and
modifying, by the one or more computing devices, the current value of at least one of the plurality of parameters of the machine-learned model based at least in part on the estimated gradient of the performance evaluation function; and
after the one or more iterations, providing, by the one or more computing devices, final values of the plurality of parameters of the machine-learned model as an output,
wherein the one or more previously evaluated perturbations are included within a trust region associated with the current values of the plurality of parameters, and
wherein identifying, by the one or more computing devices, the one or more previously evaluated perturbations that are included within the trust region comprises identifying, by the one or more computing devices, any previously evaluated perturbations that are within a radius from the current values of the plurality of parameters.
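
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the patented implementation. The claim does not specify any of the following, so all of it is assumed here: an isotropic Gaussian as the non-orthogonal sampling distribution, ridge regression as the regression, a Euclidean radius as the trust region, a cap on how many past evaluations are re-used, a baseline evaluation at the current parameters, and a gradient-ascent update. The names `reusable_evaluations`, `regression_gradient`, `optimize`, and every hyperparameter are hypothetical.

```python
import numpy as np


def reusable_evaluations(theta, history, radius, max_reuse):
    """Return past evaluations whose evaluated point lies within `radius` of theta.

    Each match is re-expressed as (perturbation relative to theta, stored value).
    Only the `max_reuse` most recent matches are kept (an assumption of this sketch).
    """
    matches = [(point - theta, value) for point, value in history
               if np.linalg.norm(point - theta) <= radius]
    return matches[-max_reuse:] if max_reuse > 0 else []


def regression_gradient(perturbations, values, baseline, ridge=1e-8):
    """Ridge-regress (value - baseline) on the perturbations.

    The fitted slope is used as the gradient estimate.
    """
    Z = np.asarray(perturbations, dtype=float)       # shape (k, d)
    y = np.asarray(values, dtype=float) - baseline   # shape (k,)
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + ridge * np.eye(d), Z.T @ y)


def optimize(f, theta0, iterations=100, num_perturbations=20, sigma=0.05,
             radius=0.15, step=0.1, max_reuse=10, seed=0):
    """Maximize the blackbox performance function f, starting from theta0."""
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    history = []        # (evaluated point, performance value) pairs
    reused_total = 0
    for _ in range(iterations):
        baseline = f(theta)  # assumed baseline evaluation at the current values
        # Identify previously evaluated perturbations inside the trust region.
        reused = reusable_evaluations(theta, history, radius, max_reuse)
        perturbations = [z for z, _ in reused]
        values = [v for _, v in reused]  # stored values, not re-evaluated
        reused_total += len(reused)
        # Fill the rest of the batch with fresh, non-orthogonalized Gaussian samples.
        for _ in range(num_perturbations - len(reused)):
            z = sigma * rng.standard_normal(theta.shape)
            value = f(theta + z)
            history.append((theta + z, value))
            perturbations.append(z)
            values.append(value)
        # Regression over perturbations and values estimates the gradient.
        grad = regression_gradient(perturbations, values, baseline)
        theta = theta + step * grad  # ascent step on the performance function
    return theta, reused_total


if __name__ == "__main__":
    target = np.array([3.0, -2.0, 1.0])  # hypothetical optimum for the demo

    def performance(x):
        return -float(np.sum((x - target) ** 2))

    final_theta, reused_total = optimize(performance, np.zeros(3))
    print("final parameters:", final_theta)
    print("re-used evaluations:", reused_total)
```

In this sketch each re-used evaluation replaces one call to the performance function, which is where the data efficiency comes from. Capping re-use and keeping the radius small limits the bias that distant points add to the linear fit on a curved function.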