US 11,720,408 B2
Method and system for assigning a virtual machine in virtual GPU enabled systems
Hari Sivaraman, Livermore, CA (US); Uday Pundalik Kurkure, Los Altos Hills, CA (US); Lan Vu, San Jose, CA (US); and Anshuj Garg, Jabalpur (IN)
Assigned to VMWARE, INC., Palo Alto, CA (US)
Filed by VMWARE, INC., Palo Alto, CA (US)
Filed on Apr. 24, 2019, as Appl. No. 16/392,668.
Claims priority of provisional application 62/668,470, filed on May 8, 2018.
Claims priority of application No. 201944003007 (IN), filed on Jan. 24, 2019.
Prior Publication US 2019/0347137 A1, Nov. 14, 2019
Int. Cl. G06F 9/50 (2006.01); G06F 9/455 (2018.01); G06T 1/20 (2006.01); G06F 17/15 (2006.01); G06N 20/00 (2019.01)

CPC G06F 9/5044 (2013.01) [G06F 9/45558 (2013.01); G06F 17/15 (2013.01); G06N 20/00 (2019.01); G06T 1/20 (2013.01); G06F 2009/4557 (2013.01)]

20 Claims

1. A computer-implemented method, comprising:

identifying, by a scheduler service executed by at least one processor, a predetermined set of assignment algorithms;

modifying, by the scheduler service, at least one of the predetermined set of assignment algorithms to generate a plurality of trained assignment algorithms that are trained to maximize a cost function comprising: a ratio of an average execution time for a plurality of virtual machines, and an average total time corresponding to execution time and run queue wait time for the plurality of virtual machines;

generating, by the scheduler service, a data structure that correlates, using the cost function, a particular one of the plurality of trained assignment algorithms with a plurality of graphics configuration parameters;

identifying, by the scheduler service, a virtual machine that is assigned a virtual graphics processing unit (vGPU) profile from an arrival queue;

identifying, by the scheduler service, a graphics configuration of a system comprising a plurality of host computers, the graphics configuration specifying a total number of vGPU-enabled graphics processing units (GPUs) installed in the plurality of host computers in the system and a virtual machine arrival rate for the arrival queue of the system;

determining that an existing run queue of a vGPU-enabled GPU of the system matches the vGPU profile of the virtual machine;

receiving, by the scheduler service, data specifying a plurality of pre-existing virtual machines in the existing run queue of the vGPU-enabled GPU of the system;

selecting, by the scheduler service, the particular one of the trained assignment algorithms that is correlated, in the data structure, with the total number of vGPU-enabled GPUs and the virtual machine arrival rate specified by the graphics configuration of the system;

suspending, by the scheduler service, a particular one of the plurality of pre-existing virtual machines in the run queue in order to free up capacity for the virtual machine, and inserting the virtual machine in a particular position in the run queue to arrange a set of virtual machines in the run queue into an updated order provided by the trained assignment algorithm that is trained to optimize the cost function; and

executing the virtual machine and the pre-existing virtual machines according to the updated order of the run queue.
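
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes a hypothetical `VM` record, two hypothetical assignment algorithms (`append_last`, `shortest_first`), a hypothetical lookup table `ALGORITHM_TABLE` keyed by (total vGPU-enabled GPU count, arrival rate), and a hypothetical suspension policy that suspends the active virtual machine with the longest execution time. Only the cost function (ratio of average execution time to average total time, where total time is execution time plus run queue wait time) is taken directly from the claim language; the training step that produces the trained algorithms is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class VM:
    """Hypothetical record for a virtual machine with an assigned vGPU profile."""
    name: str
    vgpu_profile: str
    exec_time: float         # time spent executing
    wait_time: float = 0.0   # time spent waiting in the run queue
    suspended: bool = False


def cost(vms: List[VM]) -> float:
    """Ratio of average execution time to average total time
    (execution time plus run queue wait time), as recited in the claim."""
    avg_exec = sum(v.exec_time for v in vms) / len(vms)
    avg_total = sum(v.exec_time + v.wait_time for v in vms) / len(vms)
    return avg_exec / avg_total


# Hypothetical assignment algorithms: each returns the insertion index.
def append_last(queue: List[VM], vm: VM) -> int:
    return len(queue)


def shortest_first(queue: List[VM], vm: VM) -> int:
    for i, queued in enumerate(queue):
        if vm.exec_time < queued.exec_time:
            return i
    return len(queue)


Algorithm = Callable[[List[VM], VM], int]

# Hypothetical data structure correlating a graphics configuration
# (total vGPU-enabled GPUs, VM arrival rate) with an assignment algorithm.
ALGORITHM_TABLE: Dict[Tuple[int, float], Algorithm] = {
    (4, 2.0): shortest_first,
    (8, 0.5): append_last,
}


def assign(vm: VM, queue: List[VM], queue_profile: str,
           num_gpus: int, arrival_rate: float, capacity: int) -> List[VM]:
    """Insert vm into a run queue whose profile matches its vGPU profile."""
    if queue_profile != vm.vgpu_profile:
        raise ValueError("run queue does not match the vGPU profile")
    algorithm = ALGORITHM_TABLE[(num_gpus, arrival_rate)]
    active = [q for q in queue if not q.suspended]
    if len(active) >= capacity:
        # Assumed policy: suspend the active VM with the longest execution time.
        victim = max(active, key=lambda q: q.exec_time)
        victim.suspended = True
    queue.insert(algorithm(queue, vm), vm)
    return queue
```

In this sketch a cost value closer to 1 means virtual machines spend less of their total time waiting in the run queue, which is why the claim describes the algorithms as trained to maximize it.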