US 11,855,849 B1
Artificial intelligence based self-organizing event-action management system for large-scale networks
Melissa Elaine Davis, Edmonds, WA (US); Renaud Bordelet, Dublin (FR); Charles Alexander Carman, Seattle, WA (US); David Elfi, Bothell, WA (US); Anton Vladilenovich Goldberg, San Jose, CA (US); Kyle Bradley Peterson, Seattle, WA (US); and Christopher Allen Suver, Seattle, WA (US)
Assigned to Amazon Technologies, Inc., Seattle, WA (US)
Filed by Amazon Technologies, Inc., Seattle, WA (US)
Filed on Sep. 26, 2017, as Appl. No. 15/716,201.
Int. Cl. H04L 41/12 (2022.01); G06N 20/00 (2019.01); G06N 5/046 (2023.01); H04L 67/12 (2022.01)
CPC H04L 41/12 (2013.01) [G06N 5/046 (2013.01); G06N 20/00 (2019.01); H04L 67/12 (2013.01)]
20 Claims
 
1. An artificial intelligence-based system for resource management of a network, comprising:
one or more computing devices comprising one or more processors that implement a collection of rule processing units organized into a plurality of layers, including (a) a leaf layer comprising a plurality of rule processing units and (b) at least one non-leaf layer comprising one or more rule processing units, wherein an individual rule processing unit is implemented at one or more of the one or more computing devices,
wherein a first rule processing unit of the leaf layer is configured to:
receive an indication of a first rule set to be implemented at the first rule processing unit in response to observations collected from one or more sensors associated with a particular monitored device of the network;
apply a rule of the first rule set to a first observation generated by a first sensor of the one or more sensors, wherein application of the first rule results in an initiation of a first corrective action;
cause a first set of metadata comprising respective indications of (a) the first observation and (b) the first corrective action to be stored at one or more data stores; and
in response to a determination, based at least in part on a second observation generated by the first sensor after the first corrective action has been initiated, that the initiated first corrective action did not meet a success criterion, transmit an escalation message to at least one rule processing unit at a non-leaf layer that is hierarchically above the leaf layer; and
wherein the at least one rule processing unit at the non-leaf layer is configured to determine, responsive to receipt of the escalation message from the first rule processing unit of the leaf layer, whether an additional corrective action should be initiated; and
one or more event-action record analyzers implemented at one or more other computing devices separate from the one or more computing devices that implement the collection of rule processing units, the one or more event-action record analyzers including a first event-action analyzer configured to:
identify an input data set for one or more machine learning models trained to evaluate rule sets, wherein the input data set comprises a plurality of sets of metadata generated at the collection of rule processing units, including the first set of metadata stored at the one or more data stores by the first rule processing unit of the leaf layer, wherein the first set of metadata is based on the application by the first rule processing unit of the rule of the first rule set to the first observation;
identify, using the input data set including the first set of metadata obtained from the one or more data stores, and using the one or more machine learning models, one or more rule modification recommendations including a first rule modification recommendation to modify the rule applied to the first observations; and
cause the first rule modification recommendation to be propagated to one or more rule processing units in the plurality of layers of the collection.
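
The control flow recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. All names in it (Rule, EventActionRecord, LeafUnit, NonLeafUnit, recommend_modifications, the "restart_fan" and "replace_device" action labels, and the 0.5 failure-rate threshold) are hypothetical and chosen only for illustration. In particular, the claim calls for one or more trained machine learning models in the event-action record analyzer; the sketch substitutes a plain per-action failure-rate count as a stand-in.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Rule:
    name: str
    condition: Callable[[float], bool]  # True when the observation triggers the rule
    action: str                         # label of the corrective action (hypothetical)


@dataclass
class EventActionRecord:
    """Metadata indicating an observation and the corrective action it triggered."""
    observation: float
    action: str
    succeeded: bool = False


@dataclass
class NonLeafUnit:
    """Rule processing unit in a layer above the leaf layer."""
    escalations: List[EventActionRecord] = field(default_factory=list)

    def handle_escalation(self, record: EventActionRecord) -> Optional[str]:
        # Decide whether an additional corrective action should be initiated.
        # Assumed policy for this sketch: always request a stronger action.
        self.escalations.append(record)
        return "replace_device"


@dataclass
class LeafUnit:
    """Leaf-layer rule processing unit tied to one monitored device."""
    rules: List[Rule]
    parent: NonLeafUnit
    store: List[EventActionRecord]  # stands in for the shared data store

    def process(
        self,
        first_observation: float,
        initiate_action: Callable[[str], float],  # runs the action, returns a second observation
        meets_success_criterion: Callable[[float], bool],
    ) -> Optional[str]:
        for rule in self.rules:
            if not rule.condition(first_observation):
                continue
            # Store metadata about the observation and the action it triggered.
            record = EventActionRecord(first_observation, rule.action)
            self.store.append(record)
            # Initiate the action, then check a later observation from the same sensor.
            second_observation = initiate_action(rule.action)
            record.succeeded = meets_success_criterion(second_observation)
            if not record.succeeded:
                # Escalate to the layer hierarchically above.
                return self.parent.handle_escalation(record)
            return None
        return None


def recommend_modifications(
    store: List[EventActionRecord], max_failure_rate: float = 0.5
) -> List[str]:
    """Stand-in for the analyzer: flag actions whose failure rate exceeds a threshold."""
    stats = {}
    for record in store:
        total, failed = stats.get(record.action, (0, 0))
        stats[record.action] = (total + 1, failed + (0 if record.succeeded else 1))
    return [action for action, (total, failed) in stats.items()
            if failed / total > max_failure_rate]
```

Under these assumptions, a leaf unit holding a rule such as Rule("high_temp", lambda t: t > 80.0, "restart_fan") would record each triggered action in the store, escalate to its parent when the follow-up reading still fails the success check, and leave the records available for recommend_modifications to scan. Propagating a recommendation back to the rule processing units is omitted from the sketch.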