US 12,093,553 B2
Method, device, and computer program product for managing machine learning model
Jinpeng Liu, Shanghai (CN); Jiacheng Ni, Shanghai (CN); Qiang Chen, Shanghai (CN); Danqing Sha, Shanghai (CN); and Zhen Jia, Shanghai (CN)
Assigned to EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed by EMC IP Holding Company LLC, Hopkinton, MA (US)
Filed on May 14, 2021, as Appl. No. 17/320,392.
Claims priority of application No. 202110442312.8 (CN), filed on Apr. 23, 2021.
Prior Publication US 2022/0343209 A1, Oct. 27, 2022
Int. Cl. G06F 3/06 (2006.01); G06F 8/65 (2018.01); G06N 20/00 (2019.01); G06F 12/02 (2006.01); G06F 12/0862 (2016.01); G06Q 10/0631 (2023.01)

CPC G06F 3/0647 (2013.01) [G06F 3/067 (2013.01); G06F 3/0679 (2013.01); G06F 8/65 (2013.01); G06N 20/00 (2019.01); G06F 12/0261 (2013.01); G06F 12/0862 (2013.01); G06Q 10/06316 (2013.01)]

20 Claims

1. A method for managing a machine learning model, comprising:

determining a first instance of a current version for the machine learning model and a second instance of an upgraded version for the machine learning model, the first instance executing a service for processing data, wherein the first instance and the second instance are configured to run at least in part concurrently with one another on one or more graphics processing units to provide uninterrupted access to the service for processing data using one of the first instance and the second instance in conjunction with migration of the service from the first instance to the second instance;

adjusting respectively, if determining that the service is to be migrated from the first instance to the second instance, a first allocation policy for storage space of the first instance and a second allocation policy for storage space of the second instance to a first target policy and a second target policy, wherein the first target policy is used to phase out storage space and the second target policy is used to phase in storage space;

reclaiming allocated storage space for the first instance based on the first target policy; and

allocating required storage space for the second instance based on the second target policy to realize migration of the service;

wherein the storage space comprises memory resources of the one or more graphics processing units;

wherein the first target policy is more restrictive with regard to usage of the memory resources of the one or more graphics processing units than the first allocation policy; and

wherein the second target policy is less restrictive with regard to usage of the memory resources of the one or more graphics processing units than the first target policy.
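
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes an allocation policy can be modeled as a simple cap on the GPU memory an instance may hold, and it simulates GPU memory with a plain counter rather than a real device. All names (AllocationPolicy, ModelInstance, GpuMemoryPool, migrate, the step parameter) are hypothetical and do not appear in the claim.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocationPolicy:
    """Hypothetical policy: a cap on the GPU memory an instance may hold."""
    name: str
    max_units: int


@dataclass
class ModelInstance:
    version: str
    policy: AllocationPolicy
    allocated: int = 0


class GpuMemoryPool:
    """Toy stand-in for GPU memory; it only tracks free units."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.free = capacity

    def allocate(self, inst: ModelInstance, units: int) -> int:
        # Grant no more than the pool has free and the policy cap allows.
        headroom = inst.policy.max_units - inst.allocated
        granted = max(0, min(units, self.free, headroom))
        inst.allocated += granted
        self.free -= granted
        return granted

    def reclaim(self, inst: ModelInstance, units: int) -> int:
        released = min(units, inst.allocated)
        inst.allocated -= released
        self.free += released
        return released


def migrate(pool: GpuMemoryPool, first: ModelInstance, second: ModelInstance,
            required: int, step: int) -> None:
    """Move the service from `first` to `second` in small memory steps."""
    if step <= 0:
        raise ValueError("step must be positive")
    # Adjust both policies: the first target policy (cap 0) is more
    # restrictive than the first instance's original policy; the second
    # target policy is less restrictive than the first target policy.
    first.policy = AllocationPolicy("phase-out", max_units=0)
    second.policy = AllocationPolicy("phase-in", max_units=required)
    while second.allocated < required:
        pool.reclaim(first, step)  # phase out the current version
        want = min(step, required - second.allocated)
        granted = pool.allocate(second, want)  # phase in the upgrade
        if granted == 0 and first.allocated == 0:
            raise MemoryError("pool cannot satisfy the upgraded instance")
    pool.reclaim(first, first.allocated)  # release any remainder


# Usage: both instances hold memory at the same time during the migration.
pool = GpuMemoryPool(capacity=10)
current = ModelInstance("v1", AllocationPolicy("normal", max_units=8))
pool.allocate(current, 8)
upgraded = ModelInstance("v2", AllocationPolicy("standby", max_units=0))
migrate(pool, current, upgraded, required=8, step=2)
print(current.allocated, upgraded.allocated, pool.free)
```

Reclaiming and allocating in small steps is one design choice among several. It keeps the combined footprint of the two instances within the pool while both are resident, which is what lets the upgraded version take over without the current version being torn down first.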