| CPC G06F 9/5044 (2013.01) [G06F 9/45558 (2013.01); G06F 9/5016 (2013.01); G06F 2009/45579 (2013.01); G06F 2009/45583 (2013.01)] | 20 Claims |

|
1. A non-transitory computer-readable medium comprising machine readable instructions, wherein the instructions, when executed by at least one processor, cause at least one computing device to perform operations comprising:
executing a scheduling service in a computing environment comprising one or more host computers, each of the one or more host computers having a virtualization layer that provides virtualized hardware for one or more virtualized computing instances (VCI);
identifying, by the scheduling service, a plurality of graphics processing units (GPUs) in a computing environment, wherein each of the plurality of GPUs is configured with a virtual GPU (vGPU) profile comprising a memory reservation that represents a maximum GPU memory requirement that the respective GPU will support with that respective configured vGPU profile;
sorting, by the scheduling service, a first list of the plurality of configured GPUs in increasing order of the memory requirement of the vGPU profile of each configured GPU;
receiving, by the scheduling service, a plurality of graphics processing requests, each respective graphics processing request comprising a GPU memory requirement;
sorting, by the scheduling service, a second list of the plurality of graphics processing requests according to a vGPU request-placement model of a memory requirement of each respective graphics processing request;
determining, by the scheduling service and with the vGPU request-placement model that considers the respective GPU memory requirement of each graphics processing request and the respective memory reservation of the respective vGPU profile of each configured GPU, that a first configured GPU in the sorted first list has a memory reservation that meets a memory requirement of a first memory request in the sorted second list; and
assigning, based on a determination that the first configured GPU in the sorted first list has a memory reservation that meets a memory requirement of the first memory request in the sorted second list, the first memory request to the first configured GPU.
|
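
The operations recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The names `ConfiguredGPU`, `GraphicsRequest`, and `place_requests` are hypothetical, memory is expressed in whole gigabytes for simplicity, and the requests are sorted in increasing order of memory requirement, which is an assumption because the claim only says the second list is sorted according to the placement model. The sketch also does not track how many requests a GPU can host, since the claim does not recite a capacity limit.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ConfiguredGPU:
    """Hypothetical record for a GPU configured with a vGPU profile."""
    gpu_id: str
    profile_memory_gb: int  # memory reservation of the configured vGPU profile


@dataclass(frozen=True)
class GraphicsRequest:
    """Hypothetical record for a graphics processing request."""
    request_id: str
    memory_gb: int  # GPU memory requirement of the request


def place_requests(
    gpus: Iterable[ConfiguredGPU],
    requests: Iterable[GraphicsRequest],
) -> Dict[str, Optional[str]]:
    """Map each request ID to the ID of the first GPU that fits, or None."""
    # First list: configured GPUs in increasing order of profile memory.
    sorted_gpus = sorted(gpus, key=lambda g: g.profile_memory_gb)

    # Second list: requests sorted by memory requirement.
    # Assumption: increasing order; the claim does not fix the direction.
    sorted_requests = sorted(requests, key=lambda r: r.memory_gb)

    placement: Dict[str, Optional[str]] = {}
    for req in sorted_requests:
        # Because the GPU list is sorted ascending, the first GPU whose
        # reservation meets the requirement is also the smallest that fits.
        match = next(
            (g for g in sorted_gpus if g.profile_memory_gb >= req.memory_gb),
            None,
        )
        placement[req.request_id] = match.gpu_id if match else None
    return placement
```

Under these assumptions, GPUs with 8 GB, 2 GB, and 4 GB profiles and requests needing 3 GB and 1 GB would place the 1 GB request on the 2 GB profile and the 3 GB request on the 4 GB profile. Scanning the ascending list picks the smallest profile that satisfies each request, which keeps larger profiles available for larger requests.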