CPC G06F 9/5044 (2013.01) [G06F 9/45558 (2013.01); G06F 17/15 (2013.01); G06N 20/00 (2019.01); G06T 1/20 (2013.01); G06F 2009/4557 (2013.01)] | 20 Claims |
1. A computer-implemented method, comprising:
identifying, by a scheduler service executed by at least one processor, a predetermined set of assignment algorithms;
modifying, by the scheduler service, at least one of the predetermined set of assignment algorithms to generate a plurality of trained assignment algorithms that are trained to maximize a cost function comprising: a ratio of an average execution time for a plurality of virtual machines, and an average total time corresponding to execution time and run queue wait time for the plurality of virtual machines;
generating, by the scheduler service, a data structure that correlates, using the cost function, a particular one of the plurality of trained assignment algorithms with a plurality of graphics configuration parameters;
identifying, by the scheduler service, a virtual machine that is assigned a virtual graphics processing unit (vGPU) profile from an arrival queue;
identifying, by the scheduler service, a graphics configuration of a system comprising a plurality of host computers, the graphics configuration specifying a total number of vGPU-enabled graphics processing units (GPUs) installed in the plurality of host computers in the system and a virtual machine arrival rate for the arrival queue of the system;
determining that an existing run queue of a vGPU-enabled GPU of the system matches the vGPU profile of the virtual machine;
receiving, by the scheduler service, data specifying a plurality of pre-existing virtual machines in the existing run queue of the vGPU-enabled GPU of the system;
selecting, by the scheduler service, the particular one of the trained assignment algorithms that is correlated, in the data structure, with the total number of vGPU-enabled GPUs and the virtual machine arrival rate specified by the graphics configuration of the system;
suspending, by the scheduler service, a particular one of the plurality of pre-existing virtual machines in the run queue in order to free up capacity for the virtual machine, and inserting the virtual machine in a particular position in the run queue to arrange a set of virtual machines in the run queue into an updated order provided by the trained assignment algorithm that is trained to optimize the cost function; and
executing the virtual machine and the pre-existing virtual machines according to the updated order of the run queue.
|