| CPC G06F 1/329 (2013.01) | 20 Claims |

|
1. A method, comprising:
dynamically measuring, by a processor set, latency for a plurality of functions with a plurality of corresponding frequency levels within a serverless computing cluster;
measuring, by the processor set, a transition latency from an idle state to an active state for the plurality of functions;
determining, by the processor set, whether a target response time to perform a service level objective (SLO) within the serverless computing cluster is going to be missed;
dynamically reallocating, by the processor set, at least one core and changing a frequency level across the plurality of functions by scaling down in response to a determination that the target response time to perform the SLO within the serverless computing cluster is going to be met; and
dynamically reallocating, by the processor set, the at least one core and changing the frequency level across the plurality of functions by scaling up in response to a determination that the target response time to perform the SLO within the serverless computing cluster is going to be missed,
wherein the dynamically reallocating the at least one core and changing the frequency level across the plurality of functions by scaling up comprises creating a new container instance for executing the plurality of functions on a different node than a node which includes the at least one core and degrading a performance of a function of the plurality of functions which has a lowest priority class of the plurality of functions, and
the determining whether the target response time to perform the SLO within the serverless computing cluster is going to be missed is based on execution time, queue length, number of cores, and the SLO with minimum power.
|
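
The determination recited in claim 1 (whether the target response time will be missed, based on execution time, queue length, number of cores, and the SLO with minimum power) can be illustrated with a minimal sketch. The claim does not disclose a formula, so everything below is an assumption made for illustration only: the wave-based response-time estimate, the function names `projected_response_time`, `slo_will_be_missed`, and `choose_frequency`, the sample latency figures, and the premise that a lower frequency level draws less power.

```python
import math


def projected_response_time(exec_time_s, queue_length, num_cores):
    """Estimate when a newly arriving request would finish (hypothetical model).

    Assumes queued requests are served in waves of `num_cores`, each wave
    taking `exec_time_s`; the new request completes with the last wave.
    """
    if num_cores < 1:
        raise ValueError("num_cores must be at least 1")
    waves = math.ceil((queue_length + 1) / num_cores)
    return waves * exec_time_s


def slo_will_be_missed(exec_time_s, queue_length, num_cores, target_s):
    """Return True if the estimated response time exceeds the SLO target."""
    return projected_response_time(exec_time_s, queue_length, num_cores) > target_s


def choose_frequency(latency_by_freq, queue_length, num_cores, target_s):
    """Pick the lowest frequency level that still meets the target.

    `latency_by_freq` maps a frequency level (e.g. GHz) to the execution time
    measured at that level. Lower frequency is assumed to mean lower power.
    Returns None when no level meets the target, which in the claim's terms
    is the case that calls for scaling up.
    """
    for freq in sorted(latency_by_freq):
        if not slo_will_be_missed(latency_by_freq[freq], queue_length,
                                  num_cores, target_s):
            return freq
    return None


# Illustrative measurements only: frequency level (GHz) -> execution time (s).
measured = {1.2: 0.40, 2.0: 0.25, 3.0: 0.15}
level = choose_frequency(measured, queue_length=3, num_cores=2, target_s=0.6)
print("scale up" if level is None else f"run at {level} GHz")
```

In this sketch a `None` result stands in for the scale-up branch of the claim (reallocating cores, creating a new container instance on a different node, and degrading the lowest-priority function), while any returned level below the current one corresponds to scaling down.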