US 11,997,021 B1
Automated provisioning techniques for distributed applications with independent resource management at constituent services
Satya Naga Satis Kumar Gunuputi Alluri Venka, Sammamish, WA (US); John Baker, Bellevue, WA (US); Shahab Shekari, Seattle, WA (US); Kartik Natarajan, Shoreline, WA (US); Ruhaab Markas, The Colony, TX (US); Ganesh Kumar Gella, Redmond, WA (US); and Santosh Kumar Ameti, Bellevue, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Mar. 30, 2023, as Appl. No. 18/193,502.
Int. Cl. H04L 47/762 (2022.01)

CPC H04L 47/762 (2013.01)

20 Claims

1. A system, comprising:

one or more computing devices;

wherein the one or more computing devices include instructions that upon execution on or across the one or more computing devices:

receive a particular client request at a request fulfillment coordinator of a particular service of a distributed computing environment, wherein in accordance with a service-oriented architecture the particular service utilizes a plurality of auxiliary services to fulfill client requests, including a first auxiliary service and a second auxiliary service, and wherein resources of individual ones of the auxiliary services are managed by respective resource managers;

in response to determining, by the request fulfillment coordinator, using a first throttling limit associated with a throttling key of the particular client request, that a scale-out analysis criterion has been satisfied, cause, by the request fulfillment coordinator, a scale-out analysis request associated with the throttling key to be obtained at a scaling orchestrator;

determine, by the scaling orchestrator, a peak workload metric associated with the throttling key;

based at least in part on analysis of the peak workload metric, cause, by the scaling orchestrator, a scale-out requirement associated with the throttling key to be obtained at a plurality of resource managers, including a first resource manager of the first auxiliary service and a second resource manager of the second auxiliary service;

initiate, by the first resource manager, a first set of resource provisioning tasks to fulfill the scale-out requirement associated with the throttling key, wherein the first set of resource provisioning tasks comprises adding a first amount of request processing capacity to the first auxiliary service;

initiate, by the second resource manager, asynchronously with respect to the first set of resource provisioning tasks, a second set of resource provisioning tasks to fulfill the scale-out requirement associated with the throttling key, wherein the second set of resource provisioning tasks comprises adding a second amount of request processing capacity to the second auxiliary service;

update, by the scaling orchestrator in response to determining that the first set of resource provisioning tasks and the second set of resource provisioning tasks have been completed, the first throttling limit to a second throttling limit which exceeds the first throttling limit; and

utilize, by the request fulfillment coordinator, the second throttling limit to determine whether to accept an additional client request associated with the throttling key.
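
The control flow recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the class names mirror the claim's components, but the 80% trigger fraction, the 1.5x headroom policy, the capacity units, and the in-process await (standing in for whatever messaging the services actually use) are all assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Hypothetical per-service resource manager; capacity is in requests/sec."""
    service: str
    capacity: int = 100

    async def provision(self, throttling_key: str, required: int) -> int:
        # Each manager decides independently how much capacity to add.
        added = max(0, required - self.capacity)
        await asyncio.sleep(0)  # stands in for real provisioning work
        self.capacity += added
        return added


@dataclass
class ScalingOrchestrator:
    managers: list
    limits: dict                                # throttling key -> current limit
    peaks: dict = field(default_factory=dict)   # throttling key -> peak observed rate
    headroom: float = 1.5                       # assumed policy: new limit = 1.5 x peak

    async def analyze(self, throttling_key: str) -> None:
        peak = self.peaks.get(throttling_key, 0)
        new_limit = int(peak * self.headroom)
        if new_limit <= self.limits[throttling_key]:
            return  # analysis concluded that no scale-out is needed
        # Fan the requirement out; the managers run asynchronously
        # with respect to one another.
        await asyncio.gather(
            *(m.provision(throttling_key, new_limit) for m in self.managers)
        )
        # Raise the limit only after every manager has completed its tasks.
        self.limits[throttling_key] = new_limit


@dataclass
class RequestFulfillmentCoordinator:
    orchestrator: ScalingOrchestrator
    trigger_fraction: float = 0.8  # assumed scale-out analysis criterion

    async def handle(self, throttling_key: str, observed_rate: int) -> bool:
        orch = self.orchestrator
        limit = orch.limits[throttling_key]  # limit in force when the request arrives
        orch.peaks[throttling_key] = max(
            orch.peaks.get(throttling_key, 0), observed_rate
        )
        if observed_rate >= self.trigger_fraction * limit:
            # Simplification: awaited inline rather than queued for the orchestrator.
            await orch.analyze(throttling_key)
        return observed_rate <= limit  # True = accept, False = throttle
```

In this sketch a request arriving at 90 requests/sec against a limit of 100 satisfies the assumed criterion, both managers add capacity independently, and only then does the limit rise to 135, so a later request at 120 requests/sec is accepted under the second limit.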