US 12,487,659 B2
Managing power for serverless computing
Jovan Stojkovic, Champaign, IL (US); Hubertus Franke, Cortlandt Manor, NY (US); and Alper Buyuktosunoglu, White Plains, NY (US)
Assigned to International Business Machines Corporation, Armonk, NY (US)
Filed by INTERNATIONAL BUSINESS MACHINES CORPORATION, Armonk, NY (US)
Filed on Jun. 6, 2023, as Appl. No. 18/206,283.
Prior Publication US 2024/0411357 A1, Dec. 12, 2024
Int. Cl. G06F 1/00 (2006.01); G06F 1/329 (2019.01)

CPC G06F 1/329 (2013.01)

20 Claims

1. A method, comprising:

dynamically measuring, by a processor set, latency for a plurality of functions with a plurality of corresponding frequency levels within a serverless computing cluster;

measuring, by the processor set, a transition latency from an idle state to an active state for the plurality of functions;

determining, by the processor set, whether a target response time to perform a service level objective (SLO) within the serverless computing cluster is going to be missed;

dynamically reallocating, by the processor set, at least one core and changing a frequency level across the plurality of functions by scaling down in response to a determination that the target response time to perform the SLO within the serverless computing cluster is going to be met; and

dynamically reallocating, by the processor set, the at least one core and changing the frequency level across the plurality of functions by scaling up in response to a determination that the target response time to perform the SLO within the serverless computing cluster is going to be missed,

wherein the dynamically reallocating the at least one core and changing the frequency level across the plurality of functions by scaling up comprises creating a new container instance for executing the plurality of functions on a different node than a node which includes the at least one core and degrading a performance of a function of the plurality of functions which has a lowest priority class of the plurality of functions, and

the determining whether the target response time to perform the SLO within the serverless computing cluster is going to be missed is based on execution time, queue length, number of cores, and the SLO with minimum power.
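The determining and scaling steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the claim names the inputs (execution time, queue length, number of cores, the SLO, and the measured idle-to-active transition latency) but gives no formula, so the response-time estimate below is an assumption. All names (`FunctionStats`, `estimated_response_time`, `pick_frequency`, `decide`, `plan_scale_up`) are hypothetical, and the lowest frequency level that still meets the SLO is used as a stand-in for "minimum power".

```python
from dataclasses import dataclass


@dataclass
class FunctionStats:
    """Hypothetical per-function measurements; field names are illustrative."""
    name: str
    priority: int          # assumption: a larger value means a higher priority class
    exec_time_s: dict      # frequency level (GHz) -> measured execution time (s)
    wake_latency_s: float  # measured idle-to-active transition latency (s)
    queue_length: int      # requests already waiting
    cores: int             # cores currently allocated to the function
    slo_s: float           # target response time (s)


def estimated_response_time(f, freq):
    # Assumed model: queued requests plus the new one drain in "waves" of
    # `cores` parallel executions, after one idle-to-active transition.
    waves = -(-(f.queue_length + 1) // f.cores)  # ceiling division
    return f.wake_latency_s + waves * f.exec_time_s[freq]


def pick_frequency(f):
    """Return the lowest frequency level predicted to meet the SLO, else None."""
    for freq in sorted(f.exec_time_s):
        if estimated_response_time(f, freq) <= f.slo_s:
            return freq
    return None


def decide(f):
    """Scale down when the SLO will be met, scale up when it will be missed."""
    freq = pick_frequency(f)
    if freq is None:
        return ("scale_up", max(f.exec_time_s))
    return ("scale_down", freq)


def plan_scale_up(functions, current_node, nodes):
    """Pick a different node for a new container and the function to degrade."""
    target = next(n for n in nodes if n != current_node)
    victim = min(functions, key=lambda fn: fn.priority)
    return target, victim.name


resize = FunctionStats("resize", 2, {1.2: 0.40, 2.0: 0.25, 3.0: 0.15}, 0.05, 3, 2, 0.6)
thumb = FunctionStats("thumb", 1, {1.2: 0.40, 2.0: 0.25, 3.0: 0.15}, 0.05, 9, 2, 0.6)
print(decide(resize))  # SLO reachable at a reduced frequency level
print(decide(thumb))   # SLO missed even at the highest frequency level
print(plan_scale_up([resize, thumb], "node-a", ["node-a", "node-b"]))
```

In this sketch a function whose predicted response time fits the SLO is moved to the lowest qualifying frequency level, while one that cannot meet the SLO at any level triggers the scale-up path: a new container on another node, with the lowest-priority function selected for degradation.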