CPC G06F 11/1461 (2013.01) [G06F 11/076 (2013.01); G06F 11/0769 (2013.01); G06F 11/0772 (2013.01); G06F 11/0775 (2013.01); G06F 11/0781 (2013.01); G06F 11/079 (2013.01); G06F 11/143 (2013.01); G06F 11/1458 (2013.01); G06F 2201/81 (2013.01)] | 17 Claims |
1. A method comprising:
receiving a plurality of assets to associate to a data protection policy;
receiving configuration information for the data protection policy, the configuration information comprising a data protection job to perform for the assets, and a schedule for the data protection job;
generating a shadow policy comprising the configuration information from the data protection policy, and a retry protocol;
performing the data protection job according to the schedule in the data protection policy;
detecting a failure of the data protection job for an asset associated with the data protection policy;
moving the asset out of the data protection policy and into the shadow policy;
retrying the data protection job for the asset according to the retry protocol in the shadow policy;
determining that a threshold number of retries has been reached without the data protection job having been successfully completed;
upon the determination, collecting a plurality of logs maintained by a plurality of services involved with the data protection job, the logs having recorded a set of events, timestamps when the events occurred, and severity levels for the events;
dividing a length of time over which the retries occurred into a plurality of time intervals;
forming a plurality of timeslots corresponding to the plurality of time intervals;
grouping the plurality of events into the plurality of timeslots based on the timestamps of when the events occurred;
generating a dataset by summing, for each particular timeslot and each particular severity level, a number of events that occurred in that particular timeslot and had that particular severity level;
applying k-means clustering to the dataset to generate first and second cluster sets;
identifying one of the first or second cluster sets as being a target log segment based on the one of the first or second cluster sets having a greater number of events with higher severity levels than another of the first or second cluster sets; and
reporting, to a user, the target log segment.
|
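The log-analysis steps of claim 1 (grouping events into timeslots, summing per timeslot and severity level, applying k-means clustering, and selecting the target log segment) can be illustrated with a minimal sketch. It is not the claimed implementation. The `Event` record, the three severity names, the deterministic centroid seeding, and the severity-weighted score used to compare the two clusters are all assumptions made for illustration; the claim itself only requires that the chosen cluster have a greater number of events with higher severity levels.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed severity levels, ordered from lowest to highest.
SEVERITIES = ("INFO", "WARN", "ERROR")


@dataclass(frozen=True)
class Event:
    """Hypothetical log record: one event collected from a service log."""
    timestamp: float  # seconds since the first retry (assumed unit)
    severity: str
    message: str


def build_dataset(events, start, end, n_slots):
    """Return one row per timeslot; each row counts events per severity."""
    width = (end - start) / n_slots
    counts = Counter()
    for ev in events:
        # Clamp so an event stamped exactly at `end` falls in the last slot.
        slot = min(int((ev.timestamp - start) / width), n_slots - 1)
        counts[(slot, ev.severity)] += 1
    return [[counts[(s, sev)] for sev in SEVERITIES] for s in range(n_slots)]


def kmeans_two(rows, iterations=20):
    """Lloyd's k-means with k=2; returns a cluster label (0 or 1) per row."""
    # Assumed deterministic seeding: the quietest and the busiest timeslot.
    centroids = [list(min(rows, key=sum)), list(max(rows, key=sum))]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min((0, 1), key=lambda c: sum(
                (x - m) ** 2 for x, m in zip(row, centroids[c])))
            for row in rows
        ]
        # Update step: each centroid becomes the mean of its member rows.
        for c in (0, 1):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels


def target_slots(rows, labels):
    """Return timeslot indices of the cluster with the heavier severity."""
    def score(cluster):
        # Assumed scoring: weight each count by its severity rank (0, 1, 2).
        return sum(
            weight * count
            for row, lab in zip(rows, labels) if lab == cluster
            for weight, count in enumerate(row)
        )
    target = max((0, 1), key=score)
    return [i for i, lab in enumerate(labels) if lab == target]
```

Calling `build_dataset` on the collected events, passing the rows to `kmeans_two`, and then calling `target_slots` yields the timeslot indices whose log lines would be reported to the user as the target log segment. A production system would more likely use a library k-means with multiple random restarts; the fixed seeding here only keeps the sketch reproducible.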