| CPC G06F 9/5027 (2013.01) | 8 Claims |

|
1. A method for optimizing resource allocation in a computer cluster based on prediction with reinforcement learning, implemented by a processor, comprising the steps of:
a) providing, to the processor, a prediction of the number of units of a hardware resource needed for a workload at more than N timepoints after a 0-th timepoint, wherein a maximum of M units of the resource are available for provisioning, Ui is the number of units needed at the i-th timepoint according to the prediction, and N, M and i are positive integers;
b) calculating at least one 0-th possible operation cost (POC0) based on at least one possible provisioned number (PPN) at a 1-th timepoint (PPN1) ranging from U1 to M by the processor, wherein the POC0 is given by
POC0=K+RF×|PPN1−K|+PPN1,
where RF is a rebalance factor between 0 and 1, and K is a real number;
c) for each i-th timepoint with i from 1 to N:
c1) calculating at least one i-th possible operation cost (POCi), wherein the POCi is given by
POCi=POC(i−1)+RF×|PPN(i+1)−PPNi|+PPN(i+1),
where POC(i−1) is the possible operation cost(s) calculated for the (i−1)-th timepoint, PPN(i+1) is the PPN at the (i+1)-th timepoint ranging from U(i+1) to M, PPNi is the PPN at the i-th timepoint ranging from Ui to M, and the PPNi values used for calculating POCi and POC(i−1) are the same;
c2) identifying the smallest and the second smallest POCi to efficiently prune search space and reduce computational complexity; and
c3) if the smallest and the second smallest POCi are calculated from the same PPNi, then setting the PPNi used to calculate the smallest POCi as an i-th assigned number, and removing the POCi(s) not calculated from the i-th assigned number from the calculation for the next timepoint, thereby further reducing computational burden; and
d) provisioning an i-th assigned number of units of the hardware resource at the i-th timepoint for the workload by the processor to dynamically adjust resource allocation within the computer cluster.
|
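The cost recurrence in steps b) and c1) can be illustrated with a minimal Python sketch. It is not the claimed method: the function names `plan` and `assigned_if_agree`, the list layout of the prediction, and the sample numbers are assumptions made for illustration. `plan` does not apply the pruning of steps c2) and c3); it swaps in a plain exhaustive dynamic program that keeps the cheapest path per candidate provisioned number. `assigned_if_agree` shows one possible reading of the c2)/c3) test on its own.

```python
from typing import Dict, List, Optional, Tuple


def plan(units_needed: List[int], m: int, k: float, rf: float) -> Tuple[List[int], float]:
    """Return (provisioned numbers per timepoint, total cost).

    units_needed[0] is U1, units_needed[1] is U2, and so on (assumed layout).
    Uses an exhaustive dynamic program, not the pruning of steps c2/c3.
    """
    # Step b: POC0 = K + RF*|PPN1 - K| + PPN1 for every PPN1 in [U1, M].
    best: Dict[int, Tuple[float, List[int]]] = {
        p: (k + rf * abs(p - k) + p, [p]) for p in range(units_needed[0], m + 1)
    }
    for u_next in units_needed[1:]:
        nxt: Dict[int, Tuple[float, List[int]]] = {}
        for p_next in range(u_next, m + 1):
            # Step c1: POCi = POC(i-1) + RF*|PPN(i+1) - PPNi| + PPN(i+1),
            # keeping only the cheapest predecessor for each PPN(i+1).
            cost, path = min(
                ((c + rf * abs(p_next - p) + p_next, pth) for p, (c, pth) in best.items()),
                key=lambda t: t[0],
            )
            nxt[p_next] = (cost, path + [p_next])
        best = nxt
    total, assigned = min(best.values(), key=lambda t: t[0])
    return assigned, total


def assigned_if_agree(costs: Dict[Tuple[int, int], float]) -> Optional[int]:
    """One reading of steps c2/c3.

    `costs` maps (PPNi, PPN(i+1)) to POCi. If the smallest and second
    smallest POCi were computed from the same PPNi, return that PPNi;
    otherwise return None (no assignment can be made yet).
    """
    if len(costs) < 2:
        return next(iter(costs))[0] if costs else None
    first, second = sorted(costs, key=costs.get)[:2]
    return first[0] if first[0] == second[0] else None


if __name__ == "__main__":
    # Hypothetical prediction: U1=5, U2=3, U3=5 with M=6, K=5, RF=0.9.
    print(plan([5, 3, 5], m=6, k=5, rf=0.9))
```

With these sample values the sketch keeps 5 units provisioned throughout. Holding two surplus units at the second timepoint costs less than scaling down and back up when RF is 0.9, which is the trade-off the rebalance factor is meant to capture.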