US 12,436,804 B2
Memory as a service for artificial neural network (ANN) applications
Dmitri Yudanov, Rancho Cordova, CA (US); Ameen D. Akel, Rancho Cordova, CA (US); Samuel E. Bradshaw, Sacramento, CA (US); Kenneth Marion Curewitz, Cameron Park, CA (US); and Sean Stephen Eilert, Penryn, CA (US)
Assigned to Micron Technology, Inc., Boise, ID (US)
Filed by Micron Technology, Inc., Boise, ID (US)
Filed on May 28, 2019, as Appl. No. 16/424,429.
Prior Publication US 2020/0379809 A1, Dec. 3, 2020
Int. Cl. G06F 9/50 (2006.01); G06F 12/02 (2006.01); G06F 12/08 (2016.01); G06F 12/0862 (2016.01); G06F 12/1009 (2016.01); G06F 12/1036 (2016.01); G06F 12/1072 (2016.01); G06F 12/126 (2016.01); G06F 13/16 (2006.01); G06N 3/04 (2023.01)

CPC G06F 9/5016 (2013.01) [G06F 9/5022 (2013.01); G06F 9/5077 (2013.01); G06F 12/023 (2013.01); G06F 12/0284 (2013.01); G06F 12/08 (2013.01); G06F 12/0862 (2013.01); G06F 12/1009 (2013.01); G06F 12/1036 (2013.01); G06F 12/1072 (2013.01); G06F 13/1663 (2013.01); G06N 3/04 (2013.01); G06F 12/126 (2013.01); G06F 2212/1016 (2013.01); G06F 2212/152 (2013.01); G06F 2212/502 (2013.01); G06F 2212/657 (2013.01)]

21 Claims

1. A computing device, the computing device comprising:

local memory; and

a processor that executes instructions from the local memory to configure the processor to perform operations comprising:

executing an application in the computing device, wherein the application is based on evaluating an output of an artificial neural network as the artificial neural network responding to an input, the artificial neural network having a first portion and a second portion, the output generated by the artificial neural network using the first portion and the second portion;

storing the first portion of the artificial neural network in the local memory of the computing device, wherein the second portion of the artificial neural network is stored in memory of a remote device, and the remote device and the computing device are connected via a wired or wireless network connection;

generating, by utilizing the computing device and the remote device, a first prediction of when a degradation of the wired or wireless network connection is to occur;

predicting, for a second prediction, a time to evict the first portion of the artificial neural network from the at least one memory region of the local memory based on a criticality level and the first prediction of when the degradation of the wired or wireless network connection is to occur;

evicting the first portion of the artificial neural network from the at least one memory region based on the second prediction and based on the first prediction of when the degradation of the wired or wireless network connection is to occur;

generating, by the computing device, a first output of the first portion of the artificial neural network in response to the input, wherein the second portion of the artificial neural network in the remote device is configured to receive the first output generated by the computing device using the first portion of the artificial neural network as input to generate a second output of the second portion of the artificial neural network;

predicting a usage pattern of the artificial neural network during a time period corresponding to the first prediction of when the degradation of the wired or wireless network connection is to occur;

generating, based on the usage pattern predicted for the artificial neural network, an alternative module of the computing device for use by the artificial neural network;

wherein, when the degradation of the wired or wireless network connection is predicted to occur in accordance with the first prediction, the alternative module of the computing device substitutes the second portion of the artificial neural network in the remote device to generate the second output based on the first output;

allocating, by the computing device, memory resources to the application via virtual memory addresses;

accessing, by the computing device, at least a portion of the memory of the remote device via mapping virtual memory addresses used to provide memory resources to physical memory addresses of the at least the portion of the memory of the remote device; and

generating, in the computing device and based at least in part on the application using the virtual memory addresses allocated to provide the memory resources and the second portion of the artificial neural network, a result corresponding to the second output of the second portion of the artificial neural network.
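
The data flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation; every name in it (`first_portion`, `remote_second_portion`, `alternative_module`, `predict_eviction_time`, `SplitInference`) is hypothetical, the "layers" are toy arithmetic, the remote device is simulated by a local function, and the eviction rule (evict earlier when criticality is lower) is an assumption chosen only to show a second prediction that depends on both a criticality level and the first prediction.

```python
from dataclasses import dataclass
from typing import List, Tuple

Vector = List[float]


def first_portion(x: Vector) -> Vector:
    """Toy stand-in for the ANN layers kept in local memory (scale, then ReLU)."""
    return [max(0.0, 2.0 * v) for v in x]


def remote_second_portion(h: Vector) -> float:
    """Toy stand-in for the layers stored in the remote device's memory."""
    return sum(h)


def alternative_module(h: Vector) -> float:
    """Hypothetical cheaper local substitute for the second portion."""
    return max(h) if h else 0.0


def predict_eviction_time(degradation_at: float, criticality: float,
                          lead: float = 10.0) -> float:
    """Second prediction (assumed rule, not from the claim): content with a
    lower criticality level is evicted earlier ahead of the predicted
    degradation; fully critical content is kept until degradation."""
    criticality = min(1.0, max(0.0, criticality))
    return degradation_at - lead * (1.0 - criticality)


@dataclass
class SplitInference:
    # First prediction: when the network connection is expected to degrade,
    # expressed on an arbitrary hypothetical clock (seconds).
    predicted_degradation_at: float

    def run(self, x: Vector, now: float) -> Tuple[str, float]:
        """Compute the first output locally, then obtain the second output
        from the remote portion or, once degradation is predicted to have
        begun, from the local alternative module."""
        h = first_portion(x)  # first output
        if now >= self.predicted_degradation_at:
            return "alternative", alternative_module(h)
        return "remote", remote_second_portion(h)


if __name__ == "__main__":
    net = SplitInference(predicted_degradation_at=100.0)
    print(net.run([1.0, -2.0, 3.0], now=50.0))
    print(net.run([1.0, -2.0, 3.0], now=120.0))
    print(predict_eviction_time(100.0, criticality=0.5))
```

In this sketch the switch to the alternative module is driven purely by the predicted degradation time, mirroring the "wherein" limitation; a real system would also need the virtual-to-physical address mapping onto remote memory that the final limitations of the claim recite, which is omitted here.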