| CPC H04L 45/127 (2013.01) [H04L 45/28 (2013.01)] | 20 Claims |

|
1. A method for managing operation of a distributed system comprising a data center and edge devices, the method comprising:
obtaining a health status request to assess a health status of an edge device of the edge devices;
identifying a final portion of the edge devices that are similar to the edge device by at least performing a lookup, based on precomputed similarities, to obtain a list of the final portion of the edge devices, the lookup being performed using a data structure that associates portions of the edge devices based on environmental locations, hosted hardware components, and typical workloads;
for a first edge device of the final portion of the edge devices, obtaining metrics for the first edge device;
obtaining differences between the metrics for the first edge device and metrics for the edge device;
obtaining an aggregate difference between the edge device and the final portion of the edge devices using, at least in part, the differences;
making a determination regarding whether the aggregate difference meets a criteria that when met indicates that the edge device is in an unhealthy state;
in a first instance of the determination where the aggregate difference meets the criteria:
concluding that the edge device has an unhealthy health status;
updating the operation of the distributed system based on the unhealthy health status of the edge device to obtain an updated distributed system; and
providing computer implemented services using the updated distributed system.
|
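The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. The claim does not specify how differences are computed or aggregated, so the following are assumptions made only for illustration: the names `EdgeDevice`, `similarity_index`, `aggregate_difference`, `assess_health`, and `update_active_set`; the use of a mean absolute difference over shared metric names; the threshold form of the criteria; and the "not flagged" result when the criteria is not met. The precomputed similarity data structure is modeled as a plain dictionary that maps a device identifier to a list of similar device identifiers.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class EdgeDevice:
    """Hypothetical record for an edge device and its latest metrics."""
    device_id: str
    metrics: dict[str, float]


def aggregate_difference(target: EdgeDevice, peers: list[EdgeDevice]) -> float:
    """Aggregate the per-peer metric differences into one number.

    Assumption: each per-peer difference is the mean absolute difference
    over the metric names both devices report, and the aggregate is the
    mean of those per-peer values. The claim does not fix either choice.
    """
    per_peer = []
    for peer in peers:
        shared = target.metrics.keys() & peer.metrics.keys()
        if not shared:
            continue  # nothing comparable for this peer
        per_peer.append(
            mean(abs(target.metrics[k] - peer.metrics[k]) for k in shared)
        )
    return mean(per_peer) if per_peer else 0.0


def assess_health(
    device_id: str,
    devices: dict[str, EdgeDevice],
    similarity_index: dict[str, list[str]],
    threshold: float,
) -> str:
    """Look up similar devices, compare metrics, and apply the criteria."""
    target = devices[device_id]
    # Lookup based on precomputed similarities (no similarity computed here).
    peer_ids = similarity_index.get(device_id, [])
    peers = [devices[p] for p in peer_ids if p in devices]
    # Assumed criteria: aggregate difference at or above a fixed threshold.
    if aggregate_difference(target, peers) >= threshold:
        return "unhealthy"
    return "not flagged"  # the claim only recites the unhealthy branch


def update_active_set(active: set[str], device_id: str, status: str) -> set[str]:
    """Hypothetical system update: stop using a device found unhealthy."""
    return active - {device_id} if status == "unhealthy" else set(active)


if __name__ == "__main__":
    devices = {
        "a": EdgeDevice("a", {"cpu_temp": 90.0, "latency_ms": 40.0}),
        "b": EdgeDevice("b", {"cpu_temp": 60.0, "latency_ms": 10.0}),
        "c": EdgeDevice("c", {"cpu_temp": 62.0, "latency_ms": 12.0}),
    }
    similarity_index = {"a": ["b", "c"]}
    status = assess_health("a", devices, similarity_index, threshold=20.0)
    print(status, update_active_set({"a", "b", "c"}, "a", status))
```

Keeping the similarity lookup separate from the comparison mirrors the claim's wording: the list of similar devices is read from a precomputed structure, and only the metric comparison happens at request time.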