US 11,900,169 B1
Inference flow orchestration service
Anand Dhandhania, Seattle, WA (US); and Thomas Loockx, Issaquah, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Apr. 14, 2021, as Appl. No. 17/230,784.
Int. Cl. G06F 9/46 (2006.01); G06F 9/50 (2006.01); G06F 16/901 (2019.01); G06N 20/00 (2019.01)
CPC G06F 9/505 (2013.01) [G06F 16/9024 (2019.01); G06N 20/00 (2019.01)]
20 Claims
1. A system, comprising:
one or more computing devices;
wherein the one or more computing devices include instructions that upon execution on or across the one or more computing devices cause the one or more computing devices to:
obtain, via one or more programmatic interfaces, respective descriptors of a plurality of machine learning tasks to be run to respond to a content analysis request of a first type, wherein individual ones of the respective descriptors indicate (a) program code of a machine learning task, (b) one or more metrics to be collected with respect to the machine learning task, (c) one or more categories of runtime environments for executing the machine learning task, and (d) one or more retry criteria for the machine learning task;
store a graph representation of the plurality of machine learning tasks, wherein the graph comprises a plurality of nodes and a plurality of edges, wherein a first node of the plurality of nodes represents a first machine learning task, wherein a second node of the plurality of nodes represents a second machine learning task, and wherein an edge linking the first node to the second node indicates that an output data type of the first node is compatible with an input data type of the second node;
in response to determining that a particular content analysis request of the first type has been received,
cause an orchestrator comprising one or more threads of execution to: (a) obtain a respective result from individual ones of the plurality of machine learning tasks, wherein a particular runtime environment at which an individual machine learning task is executed belongs to a category of the one or more categories indicated in the descriptor of the individual machine learning task, (b) in response to determining that a result of a particular machine learning task of the plurality of machine learning tasks satisfies a retry criterion indicated in the descriptor of the particular machine learning task, cause the particular machine learning task to be re-executed, (c) identify one or more machine learning tasks to which a result of the particular machine learning task is to be provided as input, (d) transmit the result of the particular machine learning task as input to individual ones of the one or more machine learning tasks, and (e) cause a response to the particular content analysis request to be transmitted to one or more destinations, wherein the response comprises results of at least some machine learning tasks of the plurality of machine learning tasks; and
cause to be presented, via the one or more programmatic interfaces, (a) a visual representation of the graph, indicating respective categories of runtime environments utilized for individual ones of the plurality of machine learning tasks, (b) one or more metrics indicated in the descriptors of individual ones of the plurality of machine learning tasks and (c) one or more retry statistics indicative of a number of re-executions of individual ones of the machine learning tasks in accordance with respective retry criteria.
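The flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. Every name in it (TaskDescriptor, FlowGraph, run_flow, max_retries, and the two example tasks) is hypothetical. The sketch assumes that tasks run in-process, that runtime environment categories are recorded but not enforced, that data type compatibility can be approximated with issubclass, and that each node has at most one predecessor. It shows a descriptor carrying the four recited items, a graph that accepts an edge only when the output type of the first node is compatible with the input type of the second, and an orchestrator loop that re-executes a task when its result satisfies the retry criterion and forwards each result to successor tasks.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class TaskDescriptor:
    """Hypothetical descriptor holding items (a)-(d) of the claim."""
    name: str
    code: Callable[[Any], Any]            # (a) program code of the task
    metrics: List[str]                    # (b) metrics to be collected
    runtime_categories: List[str]         # (c) categories of runtime environments
    should_retry: Callable[[Any], bool]   # (d) retry criterion applied to a result
    input_type: type = object
    output_type: type = object
    max_retries: int = 2                  # assumption: bound on re-executions


class FlowGraph:
    """Nodes are tasks; an edge requires compatible output/input data types."""

    def __init__(self) -> None:
        self.nodes: Dict[str, TaskDescriptor] = {}
        self.edges: Dict[str, List[str]] = {}

    def add_task(self, descriptor: TaskDescriptor) -> None:
        self.nodes[descriptor.name] = descriptor
        self.edges.setdefault(descriptor.name, [])

    def add_edge(self, src: str, dst: str) -> None:
        out_t = self.nodes[src].output_type
        in_t = self.nodes[dst].input_type
        # Assumption: "compatible" is modeled as a subclass relationship.
        if not issubclass(out_t, in_t):
            raise TypeError(f"{src} output {out_t} incompatible with {dst} input {in_t}")
        self.edges[src].append(dst)


def run_flow(graph: FlowGraph, root: str, request: Any) -> Tuple[Dict[str, Any], Dict[str, int]]:
    """Run tasks from the root; return per-task results and retry counts."""
    results: Dict[str, Any] = {}
    retries: Dict[str, int] = {name: 0 for name in graph.nodes}
    pending: List[Tuple[str, Any]] = [(root, request)]
    while pending:
        name, payload = pending.pop(0)
        task = graph.nodes[name]
        result = task.code(payload)
        # Re-execute while the result satisfies the retry criterion.
        while task.should_retry(result) and retries[name] < task.max_retries:
            retries[name] += 1
            result = task.code(payload)
        results[name] = result
        # Transmit the result as input to each successor task.
        for successor in graph.edges[name]:
            pending.append((successor, result))
    return results, retries


# Usage: a two-task flow whose first task fails once and is retried.
attempts = {"n": 0}

def detect_language(text: str):
    attempts["n"] += 1
    return None if attempts["n"] == 1 else "en"   # first call yields no result

graph = FlowGraph()
graph.add_task(TaskDescriptor("detect_language", detect_language, ["latency_ms"],
                              ["container"], lambda r: r is None, str, str))
graph.add_task(TaskDescriptor("tag", lambda lang: f"lang:{lang}", ["latency_ms"],
                              ["serverless"], lambda r: False, str, str))
graph.add_edge("detect_language", "tag")
results, retries = run_flow(graph, "detect_language", "hello world")
```

In this sketch the returned results dictionary stands in for the response transmitted to the destinations, and the retries dictionary stands in for the retry statistics that the claim recites as being presented alongside the graph.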