US 12,271,161 B2
Method and apparatus for optimizing operation simulation of data center
Hanchen Zhou, Beijing (CN); Qingshan Jia, Beijing (CN); and Xiao Hu, Beijing (CN)
Assigned to Tsinghua University, Beijing (CN)
Filed by Tsinghua University, Beijing (CN)
Filed on Jan. 4, 2024, as Appl. No. 18/404,301.
Claims priority of application No. 202310006010.5 (CN), filed on Jan. 4, 2023.
Prior Publication US 2024/0248440 A1, Jul. 25, 2024
Int. Cl. G05B 13/04 (2006.01); G05B 13/02 (2006.01)
CPC G05B 13/048 (2013.01) [G05B 13/0265 (2013.01)]
15 Claims
 
1. A method for optimizing operation simulation of a data center, comprising:
constructing, by a processor, a data center simulation model, wherein the data center simulation model comprises a first state prediction model and a second state prediction model;
the data center simulation model is configured to provide a simulation environment for a reinforcement learning algorithm, and a precision of the first state prediction model is less than that of the second state prediction model;
acquiring, from the data center, a state data set of the data center and an action data set of the data center, wherein the state data set of the data center comprises state data of the simulated data center at any moment, and the action data set of the data center comprises action data generated according to an action generation rule; wherein the state data set of the data center comprises at least one selected from the following state data: a temperature of a measuring point of each cold aisle in the data center, a temperature of a measuring point of each hot aisle, a fan speed of an air-conditioner in a machine room, a supply water temperature of cooling water, a return water temperature of the cooling water, a supply water temperature of chilled water and a return water temperature of the chilled water; and the action data set of the data center comprises at least one selected from the following action data: a return air temperature of each air-conditioner, a frequency of a cooling pump and a frequency of a chilled pump;
inputting, by the processor, the state data set of the data center and the action data set of the data center into the first state prediction model, to obtain a next state data set predicted by the first state prediction model after executing the action data in the action data set of the data center;
judging, by the processor, based on a preset state safe judgment condition, whether the next state data set predicted by the first state prediction model meets the state safe judgment condition that is set based on a prediction precision of the first state prediction model;
inputting, by the processor, if the next state data set predicted by the first state prediction model meets the state safe judgment condition, the state data set of the data center and the action data set of the data center into the second state prediction model, to obtain a next state data set predicted by the second state prediction model after executing the action data in the action data set of the data center;
optimizing, by the processor, a network parameter of the reinforcement learning algorithm using the next state data set predicted by the second state prediction model, the state data set of the data center, and the action data set of the data center, to obtain a trained reinforcement learning algorithm;
determining, by the processor, an action data set corresponding to a real-time state data set of the data center by using the trained reinforcement learning algorithm, and determining the action data set corresponding to the real-time state data set as a control strategy of the data center; and
controlling, by the processor and based on the control strategy of the data center, the air-conditioner, the cold aisle, the hot aisle, the cooling pump and the chilled pump of the data center.
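
The two-stage prediction in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the one-dimensional state (a single cold-aisle temperature), the one-dimensional action (a return air temperature setpoint), both model formulas, the 27.0 limit, the 1.0 error bound, and every function name are assumptions chosen for illustration. A low-precision model screens each state-action pair, the safety condition is tightened by that model's assumed error, and only pairs that pass are sent to the higher-precision model to produce transitions for reinforcement learning training. The claim does not say what happens to pairs that fail the check; this sketch simply skips them.

```python
from typing import Callable, List, Sequence, Tuple

State = Tuple[float, ...]
Action = Tuple[float, ...]
Transition = Tuple[State, Action, State]


def coarse_model(state: State, action: Action) -> State:
    """Hypothetical low-precision predictor: cheap pull toward the setpoint."""
    (temp,) = state
    (setpoint,) = action
    return (temp + 0.5 * (setpoint - temp),)


def fine_model(state: State, action: Action) -> State:
    """Hypothetical higher-precision predictor standing in for a costly simulation."""
    (temp,) = state
    (setpoint,) = action
    return (temp + 0.45 * (setpoint - temp) + 0.2,)


def is_safe(next_state: State, limit: float = 27.0, coarse_error: float = 1.0) -> bool:
    """Safety condition tightened by the coarse model's assumed error bound."""
    return all(t <= limit - coarse_error for t in next_state)


def collect_transitions(
    states: Sequence[State],
    actions: Sequence[Action],
    coarse: Callable[[State, Action], State] = coarse_model,
    fine: Callable[[State, Action], State] = fine_model,
    safe: Callable[[State], bool] = is_safe,
) -> List[Transition]:
    """Screen each pair with the coarse model; run the fine model only if safe."""
    transitions: List[Transition] = []
    for s, a in zip(states, actions):
        if not safe(coarse(s, a)):
            continue  # assumption: pairs predicted unsafe are dropped
        transitions.append((s, a, fine(s, a)))
    return transitions


if __name__ == "__main__":
    states = [(24.0,), (30.0,)]
    actions = [(22.0,), (32.0,)]
    for s, a, s_next in collect_transitions(states, actions):
        print(s, a, s_next)
```

The returned (state, action, next state) tuples correspond to the data the claim uses to optimize the network parameter of the reinforcement learning algorithm; the training step itself is omitted here.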