US 12,379,971 B2
Reliability-aware resource allocation method and apparatus in disaggregated data centers
Chao Guo, Hong Kong (HK); Xinyu Wang, Hong Kong (HK); and Moshe Zukerman, Hong Kong (HK)
Assigned to City University of Hong Kong, Hong Kong (HK)
Filed by City University of Hong Kong, Hong Kong (HK)
Filed on Jan. 28, 2022, as Appl. No. 17/586,818.
Prior Publication US 2023/0244545 A1, Aug. 3, 2023
Int. Cl. G06F 9/50 (2006.01)
CPC G06F 9/5083 (2013.01)
8 Claims
 
1. A disaggregated data center (DDC), comprising:
a plurality of working nodes, each of the working nodes comprising one or more computing resources of only one computing resource type, the computing resource type selected from a group of computing resource types comprising: a central processing unit (CPU), a graphical processing unit (GPU), transient memory circuitry, and non-transient memory circuitry;
a plurality of backup nodes, each of the backup nodes comprising one or more computing resources of only one computing resource type, the computing resource type selected from a group of computing resource types comprising: a central processing unit (CPU), a graphical processing unit (GPU), transient memory circuitry, and non-transient memory circuitry;
a first processor configured to execute a reliability model to determine an achievable reliability for a service request to the DDC; and
a second processor configured to execute an integer linear programming (ILP) model to perform a resource allocation for the service request to the DDC;
wherein the DDC comprises multiple computing resource types, and the execution of the service request received by the DDC requires performance of a computing resource of at least one of the computing resource types; and
wherein nodes of the same computing resource type are configured to form a parallel system such that as long as at least one of the nodes in the parallel system is available, the parallel system is available for performance in the execution of the service request received by the DDC;
wherein each service request received by the DDC is executed by one or more working nodes corresponding to one or more necessary computing resource types, respectively, for an execution of the service request; and if a reliability of the one or more working nodes is lower than a reliability requirement for the service request, the service request is executed by one or more backup nodes corresponding to the one or more necessary computing resource types, respectively;
wherein the performance of the ILP model comprises:
maximizing a total number of service requests received by the DDC that are accepted for execution;
minimizing a number of the accepted service requests allocated with backup nodes; and
being subject to one or more constraints comprising:
a working node and a backup node allocated to the service request do not share a same computing resource;
if the reliability of the one or more working nodes allocated to the service request is equal to or higher than the reliability requirement for the service request, the service request is accepted for execution regardless of whether one or more backup nodes are allocated to the service request;
a total resource demand of all service requests allocated with a node and pending for execution is not higher than a resource capacity of the node; and
a reliability of computing resources of a computing resource type in the DDC must be equal to or higher than the reliability requirement of the service request before its acceptance for execution.
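
The reliability check and capacity constraint recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented reliability model or ILP formulation. It assumes that node failures are independent, so that a parallel system of same-type nodes is available with probability 1 minus the product of the individual unavailabilities, and that a request needing several resource types is available only when every type is available (a series composition). The function names, reliability values, and capacities are hypothetical and chosen only for illustration.

```python
from math import prod


def parallel_reliability(node_reliabilities):
    """Reliability of same-type nodes forming a parallel system.

    The system is available if at least one node is available.
    Assumption (not stated in the claim): node failures are independent.
    """
    return 1.0 - prod(1.0 - r for r in node_reliabilities)


def request_reliability(nodes_by_type):
    """Reliability seen by a request that needs every listed resource type.

    Assumption: the resource types compose in series, so the request is
    served only if the parallel system of each required type is available.
    """
    return prod(parallel_reliability(rs) for rs in nodes_by_type.values())


def needs_backup(working_nodes_by_type, requirement):
    """True if the working nodes alone fall short of the requirement."""
    return request_reliability(working_nodes_by_type) < requirement


def capacity_ok(demands_on_node, node_capacity):
    """Capacity constraint: total demand on a node must not exceed capacity."""
    return sum(demands_on_node) <= node_capacity


# Hypothetical request needing one CPU node and one memory node.
working = {"cpu": [0.99], "memory": [0.98]}
# Same request with one backup node added per resource type.
with_backup = {"cpu": [0.99, 0.95], "memory": [0.98, 0.97]}

requirement = 0.99
print(needs_backup(working, requirement))      # working nodes alone: about 0.9702
print(needs_backup(with_backup, requirement))  # with backups: about 0.9989
```

In this sketch the working nodes alone give a reliability of roughly 0.9702, below the hypothetical requirement of 0.99, so backup nodes are allocated. With one backup per type the reliability rises to roughly 0.9989, which meets the requirement. A full allocation would additionally search over node assignments, for example with an ILP solver, to maximize accepted requests while minimizing those that need backups.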