US 12,260,236 B1
Machine learning model replacement on edge devices
Chao Zhou, Fremont, CA (US); Maxwell Edward Chapman Nuyens, Redwood City, CA (US); and Ravish Hastantram, Fremont, CA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Dec. 16, 2022, as Appl. No. 18/067,171.
Int. Cl. G06F 9/455 (2018.01)
CPC G06F 9/45508 (2013.01) 20 Claims
OG exemplary drawing
 
1. A computer-implemented method comprising:
receiving a request to load a second machine learning (ML) model onto an edge device while a first ML model has already been loaded into memory of the edge device, wherein the second ML model and the first ML model share an external handle but have different internal aliases;
in response to the request, the edge device executing edge compute code using one or more processors to load at least one instance of the second ML model into the memory of the edge device while the first ML model is still loaded into the memory of the edge device;
after the second ML model has been loaded into the memory of the edge device, receiving a prediction request at the shared external handle of the first ML model and the second ML model;
in response to the prediction request, the edge device executing the edge compute code using the one or more processors to perform a handle-to-alias translation to determine to direct the prediction request to the second ML model instead of the first ML model;
directing the prediction request to the second ML model instead of the first ML model; and
after directing the prediction request to the second ML model instead of the first ML model, unloading the first ML model from the memory of the edge device.
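The claim describes swapping a model behind a stable external handle: the replacement instance is loaded under a distinct internal alias while the original stays in memory, incoming prediction requests are routed through a handle-to-alias translation that now resolves to the new instance, and only afterwards is the original unloaded. The sketch below is an illustrative Python approximation of that flow, not the patented implementation; the EdgeModelRegistry class, its method names, and the lambda stand-ins for loaded models are hypothetical.

```python
import threading
from typing import Any, Callable, Dict


class EdgeModelRegistry:
    """Hypothetical edge-compute registry that maps one external handle
    to an internal alias, so a model can be replaced without changing
    the handle that callers use for prediction requests."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._handle_to_alias: Dict[str, str] = {}          # external handle -> internal alias
        self._alias_to_model: Dict[str, Callable] = {}      # internal alias -> loaded model

    def load(self, handle: str, alias: str, model: Callable) -> None:
        """Load a model instance under a new internal alias and point the
        shared external handle at it; any previously loaded model stays
        in memory until it is explicitly unloaded."""
        with self._lock:
            self._alias_to_model[alias] = model              # new model loaded alongside the old one
            self._handle_to_alias[handle] = alias            # handle now resolves to the new alias

    def predict(self, handle: str, payload: Any) -> Any:
        """Handle-to-alias translation: resolve the external handle to the
        currently active internal alias, then invoke that model."""
        with self._lock:
            alias = self._handle_to_alias[handle]
            model = self._alias_to_model[alias]
        return model(payload)

    def unload(self, alias: str) -> None:
        """Drop an old model instance once traffic has moved to its replacement."""
        with self._lock:
            self._alias_to_model.pop(alias, None)


if __name__ == "__main__":
    registry = EdgeModelRegistry()

    # First model is already loaded and serving the shared handle.
    registry.load("detector", alias="detector-v1", model=lambda x: f"v1:{x}")

    # Request to replace it: load the second model under a new alias
    # while the first is still resident in memory.
    registry.load("detector", alias="detector-v2", model=lambda x: f"v2:{x}")

    # A prediction request at the shared handle is now directed to the second model.
    print(registry.predict("detector", "frame-001"))  # -> "v2:frame-001"

    # Only after the switchover is the first model unloaded.
    registry.unload("detector-v1")
```

In this sketch the lock makes the alias swap atomic with respect to in-flight handle lookups, so a prediction request observes either the old or the new model but never a partially updated mapping; a production edge runtime would add model warm-up, reference counting for requests still executing against the old instance, and error handling, none of which the claim language spells out here.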