US 11,886,963 B2
Optimizing machine learning models
Matthew Welsh, Seattle, WA (US); Jason Knight, San Diego, CA (US); Jared Roesch, Seattle, WA (US); Thierry Moreau, Seattle, WA (US); Adelbert Chang, Cupertino, CA (US); Tianqi Chen, Pittsburgh, PA (US); Luis Henrique Ceze, Seattle, WA (US); An Wang, Seattle, WA (US); Michal Piszczek, Seattle, WA (US); Andrew McHarg, Kirkland, WA (US); and Fletcher Haynes, Redmond, WA (US)
Assigned to OctoML, Inc., Seattle, WA (US)
Filed by OctoML, Inc., Seattle, WA (US)
Filed on Feb. 23, 2021, as Appl. No. 17/183,075.
Claims priority of provisional application 63/120,017, filed on Dec. 1, 2020.
Prior Publication US 2022/0172110 A1, Jun. 2, 2022
Int. Cl. G06N 20/00 (2019.01); G06F 8/35 (2018.01); G06F 18/214 (2023.01); G06F 18/21 (2023.01); G06N 3/082 (2023.01); G06N 3/063 (2023.01)

CPC G06N 20/00 (2019.01) [G06F 8/35 (2013.01); G06F 18/214 (2023.01); G06F 18/217 (2023.01)]

19 Claims

1. One or more instances of computer-readable media collectively having contents configured to cause a computing device to obtain a plurality of machine learning models and optimize the plurality of machine learning models, none of the instances of computer-readable media constituting a transitory propagating signal per se, the method comprising:

obtaining a plurality of machine learning models, including obtaining a description of each machine learning model of the plurality of machine learning models; and

for each machine learning model of the plurality of machine learning models:

identifying a hardware target of the machine learning model, wherein the hardware target includes an indication of hardware upon which the machine learning model is to be deployed;

identifying a model type of the machine learning model based on the description of the machine learning model;

retrieving optimization result data from a repository of optimization result data based on the identified hardware target and the model type, the retrieved optimization result data reflecting a level of consumption of hardware resources resulting from changing one or more machine learning models;

changing the machine learning model for optimized operation on hardware specified by the identified hardware target, based on the hardware target and the retrieved optimization result data, wherein changing the machine learning model comprises changing one or more of:

software code associated with one or more operators used by the machine learning model; or

software code associated with one or more partitions of the machine learning model;

obtaining additional optimization result data by evaluating the performance of the optimized machine learning model on hardware indicated by the hardware target, the additional optimization result data including an indication of the hardware target and the model type; and

storing the additional optimization result data within the repository of optimization result data.
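
For illustration only, the per-model loop recited in claim 1 can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation. Every name in it (Model, OptimizationResult, ResultRepository, optimize_models, the operator_variant label, and the evaluate callback) is hypothetical. The sketch also assumes that the hardware target is carried in the model description, that "changing software code associated with one or more operators" can be represented by selecting a named operator variant, and that measured latency stands in for the level of consumption of hardware resources.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OptimizationResult:
    """One repository record (hypothetical layout)."""
    hardware_target: str
    model_type: str
    operator_variant: str  # assumed stand-in for changed operator code
    latency_ms: float      # assumed stand-in for hardware resource consumption


@dataclass
class Model:
    """A model reduced to its description and current operator variant."""
    description: dict
    operator_variant: str = "default"


class ResultRepository:
    """In-memory repository keyed by (hardware target, model type)."""

    def __init__(self):
        self._results = {}

    def retrieve(self, hardware_target, model_type):
        return list(self._results.get((hardware_target, model_type), []))

    def store(self, result):
        key = (result.hardware_target, result.model_type)
        self._results.setdefault(key, []).append(result)


def optimize_models(models, repository, evaluate):
    """Run the identify / retrieve / change / evaluate / store loop per model."""
    optimized = []
    for model in models:
        # Identify the hardware target and the model type (assumed to be
        # fields of the description in this sketch).
        hardware_target = model.description["hardware_target"]
        model_type = model.description["model_type"]

        # Retrieve prior results for this (hardware target, model type) pair.
        prior = repository.retrieve(hardware_target, model_type)

        # Change the model: reuse the variant with the lowest recorded cost,
        # or keep the current variant when no prior results exist.
        if prior:
            variant = min(prior, key=lambda r: r.latency_ms).operator_variant
        else:
            variant = model.operator_variant
        changed = Model(description=model.description, operator_variant=variant)

        # Evaluate the changed model on the target and store the new result,
        # tagged with the hardware target and model type.
        latency_ms = evaluate(changed, hardware_target)
        repository.store(
            OptimizationResult(hardware_target, model_type, variant, latency_ms)
        )
        optimized.append(changed)
    return optimized
```

In this sketch, evaluate is any caller-supplied function that takes the changed model and the hardware target and returns a measured cost. Because each run stores its own measurement back into the repository, later models with the same hardware target and model type can draw on results from earlier ones.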