US 12,436,814 B1
Resource right-sizing for compute clusters
Shaokang Ni, Seattle, WA (US); Siyu Wang, Seattle, WA (US); Letian Feng, Clyde Hill, WA (US); Malcolm Featonby, Sammamish, WA (US); Nathaniel Baird Jones, Carnation, WA (US); and Zachary Daniel Casper, Austin, TX (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Dec. 16, 2022, as Appl. No. 18/067,623.
Int. Cl. G06F 9/44 (2018.01); G06F 9/50 (2006.01); G06F 11/34 (2006.01)
CPC G06F 9/5055 (2013.01) [G06F 11/3409 (2013.01)]
20 Claims
 
1. A system, comprising:
one or more processors; and
a memory storing program instructions that, when executed on the one or more processors, implement a right-sizing service configured to:
compute one or more resource sizes for respective nodes of a computing cluster executing an application, wherein to compute the one or more resource sizes the right-sizing service is configured to:
collect performance metrics for respective ones of a plurality of heterogeneous nodes of the computing cluster executing the application;
normalize the collected performance metrics for the respective ones of the plurality of heterogeneous nodes according to respective computational benchmarks for the respective ones of the plurality of heterogeneous nodes;
identify one or more of the respective ones of the plurality of heterogeneous nodes according to the normalized performance metrics for the respective ones of the plurality of heterogeneous nodes as indicative of node utilization for the application;
temporally align respective time series data of respective ones of the plurality of heterogeneous nodes indicative of node utilization for the application to identify a plurality of phases of execution of the application;
estimate respective resource sizes for respective ones of the plurality of phases of execution of the application; and
identify a slowest node of the plurality of heterogeneous nodes; and
provide a recommendation comprising the one or more resource sizes based at least in part on the respective estimated resource sizes for respective ones of the plurality of phases of execution of the application and the identified slowest node.
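
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation: the claim does not specify data structures, thresholds, or formulas, so every name here (NodeMetrics, normalize, align, split_phases, recommend), the threshold-based phase detection, the headroom factor, and the way the slowest node enters the sizing are assumptions made only for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class NodeMetrics:
    """Hypothetical per-node record; all field names are assumptions."""
    name: str
    benchmark: float       # relative compute score, 1.0 = baseline node
    start: int             # timestamp of the first sample, in seconds
    cpu_util: list[float]  # one utilization sample (0..1) per second


def normalize(node):
    """Scale raw utilization by the benchmark score (baseline-equivalent load)."""
    return [u * node.benchmark for u in node.cpu_util]


def align(nodes):
    """Trim each normalized series to the time window shared by all nodes."""
    begin = max(n.start for n in nodes)
    end = min(n.start + len(n.cpu_util) for n in nodes)
    return {n.name: normalize(n)[begin - n.start:end - n.start] for n in nodes}


def split_phases(series, threshold=0.5):
    """Group consecutive samples on the same side of the threshold (assumed rule)."""
    phases = []
    for i, value in enumerate(series):
        high = value >= threshold
        if phases and phases[-1][2] == high:
            phases[-1][1] = i + 1          # extend the current phase
        else:
            phases.append([i, i + 1, high])  # start a new phase
    return [(s, e) for s, e, _ in phases]


def recommend(nodes, min_mean_load=0.05, headroom=1.2):
    # Keep only nodes whose normalized load suggests they run the application.
    active = [n for n in nodes if mean(normalize(n)) >= min_mean_load]
    aligned = align(active)
    length = len(next(iter(aligned.values())))
    # Cluster-level series: mean normalized load across nodes at each time step.
    cluster = [mean(s[t] for s in aligned.values()) for t in range(length)]
    phases = split_phases(cluster)
    slowest = min(active, key=lambda n: n.benchmark)
    # Per-phase size: peak load plus headroom, in units of the slowest node.
    sizes = [round(max(cluster[s:e]) * headroom / slowest.benchmark, 3)
             for s, e in phases]
    return {"phases": phases, "sizes": sizes, "slowest": slowest.name}


nodes = [
    NodeMetrics("a", 1.0, 0, [0.2, 0.2, 0.8, 0.8, 0.2]),
    NodeMetrics("b", 2.0, 1, [0.1, 0.4, 0.4, 0.1, 0.1]),
    NodeMetrics("idle", 1.0, 0, [0.0, 0.0, 0.0, 0.0, 0.0]),
]
result = recommend(nodes)
print(result)
```

In this sketch the idle node is filtered out before alignment, the two remaining series are trimmed to their overlapping window, and each detected phase receives its own size expressed relative to the slowest active node. A real service would likely use richer metrics and a more robust phase-detection method than a fixed threshold.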