CPC G06F 11/0793 (2013.01) [G06F 11/0772 (2013.01); G06F 11/079 (2013.01); G06F 11/3034 (2013.01); G06F 18/214 (2023.01); G06F 18/2178 (2023.01); G06F 18/24323 (2023.01); G06N 20/00 (2019.01)] | 18 Claims |
1. A computer-implemented method comprising:
generating, using a processor, an alert regarding an entity in a system, wherein the system includes a plurality of storage groups including a first storage group of logical devices and including one or more other storage groups of logical devices, wherein each of the plurality of storage groups includes logical devices with corresponding storage capacity configured from physical storage devices of the system, wherein the entity is the first storage group;
performing, using a processor, anomaly detection on the alert to determine whether the alert indicates a first anomaly with respect to historical data for a specified metric for the entity;
responsive to determining that the alert indicates the first anomaly with respect to the historical data for the specified metric for the entity, performing first processing including:
receiving, using a processor, notification of the alert regarding the entity in the system;
responsive to receiving the notification, performing second processing including:
receiving, using a processor, one or more expected causes of the alert wherein the one or more expected causes are determined in accordance with one or more rules and in accordance with an input including one or more metrics characterizing any of quality of service, performance, resource usage and workload of one or more entities of the system, wherein the input includes first configuration information identifying first shared resources that are shared by the first storage group and by the one or more other storage groups, wherein the first shared resources include a first storage system port, wherein first logical devices of the first storage group and second logical devices of the one or more other storage groups are exposed over a same port, the first storage system port, such that first I/Os directed to the first logical devices and second I/Os directed to the second logical devices are both received at the first storage system port, wherein the alert includes an adverse performance violation for the first storage group, wherein the one or more expected causes of the adverse performance violation include a first expected cause identifying a second storage group of the one or more other storage groups as a noisy neighbor of the first storage group where a respective I/O workload of the second storage group is an expected cause of the adverse performance violation of the first storage group;
receiving, using a processor, one or more remediations that are determined in accordance with the one or more expected causes of the alert, wherein the one or more remediations denote one or more corresponding actions that alleviate or remove the one or more expected causes of the alert wherein the one or more remediations include a first remediation that alleviates or removes the first expected cause of the adverse performance violation; and
implementing the first remediation in the system, including reducing a first host I/O limit setting for the second storage group to thereby decrease a maximum allowable front-end or host I/O rate of the second storage group on the system;
and
responsive to determining that the alert does not indicate the first anomaly with respect to the historical data for the specified metric for the entity, determining using a processor that the alert denotes a false alarm and not generating said notification of the alert.
|
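The flow recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes a z-score test for the anomaly detection step, a single rule ("busiest storage group sharing a port with the affected group") for the expected cause, and a fixed fractional reduction of the host I/O limit as the remediation. The names `StorageGroup`, `is_anomaly`, `find_noisy_neighbor`, `handle_alert`, the port labels, and all thresholds are hypothetical and do not appear in the claim.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class StorageGroup:
    name: str
    ports: set            # storage system ports over which the group's logical devices are exposed
    io_rate: float        # current host I/O rate in IOPS (hypothetical workload metric)
    host_io_limit: float  # maximum allowable front-end/host I/O rate in IOPS


def is_anomaly(history, value, z_threshold=3.0):
    """Assumed anomaly test: z-score of the alert's metric value against historical data."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > z_threshold


def find_noisy_neighbor(victim, others):
    """Assumed rule: among groups sharing a port with the victim, pick the one with the highest I/O rate."""
    sharing = [g for g in others if g.ports & victim.ports]
    return max(sharing, key=lambda g: g.io_rate, default=None)


def handle_alert(victim, others, history, value, reduction=0.5):
    """Gate the alert on anomaly detection, then identify a cause and apply a remediation."""
    if not is_anomaly(history, value):
        return "false alarm"  # no notification is generated
    neighbor = find_noisy_neighbor(victim, others)
    if neighbor is None:
        return "no expected cause"
    # Remediation: lower the noisy neighbor's host I/O limit (assumed fractional reduction).
    neighbor.host_io_limit *= reduction
    return f"reduced host I/O limit of {neighbor.name}"


if __name__ == "__main__":
    sg1 = StorageGroup("SG1", {"FA-1"}, io_rate=1000, host_io_limit=5000)
    sg2 = StorageGroup("SG2", {"FA-1"}, io_rate=9000, host_io_limit=10000)
    sg3 = StorageGroup("SG3", {"FA-2"}, io_rate=20000, host_io_limit=30000)
    response_times_ms = [5.0, 5.2, 4.9, 5.1, 5.0]  # hypothetical historical metric values
    print(handle_alert(sg1, [sg2, sg3], response_times_ms, value=12.0))
    print(sg2.host_io_limit)
```

In this sketch SG3 has the heaviest workload but is not selected, because it shares no port with SG1; only groups whose I/Os arrive at the same storage system port are candidates for the noisy neighbor cause.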