US 11,989,586 B1
Scaling up computing resource allocations for execution of containerized applications
Archana Srikanta, Bellevue, WA (US); Onur Filiz, Redmond, WA (US); Prashant Prahlad, Seattle, WA (US); Amit Gupta, Bellevue, WA (US); and Song Hu, Sammamish, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Jun. 30, 2021, as Appl. No. 17/305,143.
Int. Cl. G06F 9/48 (2006.01); G06F 8/60 (2018.01); G06F 9/455 (2018.01); G06F 9/50 (2006.01); H04L 67/02 (2022.01); H04L 67/30 (2022.01)

CPC G06F 9/5005 (2013.01) [G06F 8/60 (2013.01); G06F 9/455 (2013.01); G06F 9/45558 (2013.01); G06F 9/48 (2013.01); G06F 9/4806 (2013.01); G06F 9/4843 (2013.01); G06F 9/485 (2013.01); G06F 9/4881 (2013.01); G06F 9/50 (2013.01); G06F 9/5011 (2013.01); G06F 9/5022 (2013.01); G06F 9/5027 (2013.01); G06F 9/505 (2013.01); G06F 9/5061 (2013.01); G06F 9/5072 (2013.01); G06F 9/5077 (2013.01); G06F 9/5083 (2013.01); H04L 67/02 (2013.01); H04L 67/30 (2013.01)]

20 Claims

1. A cloud provider system comprising:

a serverless container management service configured to acquire compute capacity and execute a task using the acquired compute capacity in response to a request to execute the task; and

an application execution service in networked communication with the serverless container management service,

wherein the application execution service is configured to at least:

receive a request to deploy a Web application configured to handle a plurality of HTTP requests; and

transmit, to the serverless container management service, a task execution request to execute a set of tasks to be used to implement the Web application, wherein the set of tasks includes at least a first task and a second task that are each configured to handle a subset of the plurality of HTTP requests directed to the Web application, and wherein the task execution request indicates that each of the set of tasks is to be allocated a first amount of computing resources,

wherein the serverless container management service is further configured to at least:

in response to the task execution request from the application execution service, cause the set of tasks to be executed, wherein each task of the set of tasks is placed in a standby mode in which a second amount of computing resources that is less than the first amount of computing resources is allocated to said each task of the set of tasks; and

in response to an HTTP request routed to the first task, cause the HTTP request to be executed by the first task,

wherein the application execution service is further configured to at least:

based at least in part on the first task associated with the Web application being at a first priority level and the second task associated with the Web application being at a second priority level lower than the first priority level, route a set of HTTP requests directed to the Web application to the first task, such that the second task continues to stay in the standby mode with the second amount of computing resources allocated thereto while the first task is processing the set of HTTP requests with the first amount of computing resources allocated thereto that is greater than the second amount of computing resources; and

based at least in part on the first task not having enough capacity to handle an additional HTTP request directed to the Web application, route the additional HTTP request to the second task, and

wherein the serverless container management service is further configured to at least:

in response to the additional HTTP request routed to the second task, cause the additional request to be processed by the second task with a reduced amount of computing resources that is less than the first amount; and

based on a utilization level associated with the second task, cause an increased amount of computing resources that is greater than the reduced amount to be allocated to the second task.
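
The routing and scale-up behavior recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. The names `Task`, `route`, and `handle_request`, the resource units, the per-task `max_in_flight` capacity limit, and the 0.5 utilization threshold are all hypothetical and chosen only for illustration. As a simplification, the sketch applies the same utilization rule to every task, whereas the claim recites the utilization-based increase only for the second task.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical values: the claim only requires STANDBY_AMOUNT < FULL_AMOUNT.
FULL_AMOUNT = 1.0            # the "first amount" of computing resources
STANDBY_AMOUNT = 0.25        # the "second amount" allocated in standby mode
SCALE_UP_UTILIZATION = 0.5   # assumed utilization threshold for scaling up


@dataclass
class Task:
    name: str
    priority: int                       # lower value = higher priority level
    max_in_flight: int                  # assumed per-task capacity limit
    allocated: float = STANDBY_AMOUNT   # every task starts in standby mode
    in_flight: int = 0

    def has_capacity(self) -> bool:
        return self.in_flight < self.max_in_flight

    def utilization(self) -> float:
        return self.in_flight / self.max_in_flight


def route(tasks: List[Task]) -> Optional[Task]:
    """Return the highest-priority task that can accept another request."""
    for task in sorted(tasks, key=lambda t: t.priority):
        if task.has_capacity():
            return task
    return None


def handle_request(tasks: List[Task]) -> Optional[Tuple[str, float]]:
    """Route one request; return (task name, allocation it was accepted with)."""
    task = route(tasks)
    if task is None:
        return None  # every task is full; claim 1 does not address this case
    served_with = task.allocated  # a standby task starts with the reduced amount
    task.in_flight += 1
    # Scale up based on the task's utilization level.
    if task.utilization() >= SCALE_UP_UTILIZATION:
        task.allocated = FULL_AMOUNT
    return task.name, served_with
```

With two tasks of capacity 2, the first two requests go to the higher-priority task while the second task stays in standby. The third request spills over to the second task, which accepts it at the standby allocation and is then scaled up.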