US 12,238,129 B2
Customized anomaly detection
Congrui Huang, Redmond, WA (US); Yujing Wang, Redmond, WA (US); Bixiong Xu, Redmond, WA (US); Guodong Xing, Redmond, WA (US); Mao Yang, Redmond, WA (US); Jie Tong, Redmond, WA (US); Jing Bai, Redmond, WA (US); Meng Ai, Redmond, WA (US); and Qi Zhang, Redmond, WA (US)
Assigned to Microsoft Technology Licensing, LLC, Redmond, WA (US)
Appl. No. 17/783,240
Filed by Microsoft Technology Licensing, LLC, Redmond, WA (US)
PCT Filed Nov. 17, 2020, PCT No. PCT/US2020/060814
§ 371(c)(1), (2) Date Jun. 7, 2022,
PCT Pub. No. WO2021/141674, PCT Pub. Date Jul. 15, 2021.
Claims priority of application No. 202010013081.4 (CN), filed on Jan. 7, 2020.
Prior Publication US 2023/0029794 A1, Feb. 2, 2023
Int. Cl. H04L 9/40 (2022.01); H04L 41/16 (2022.01)
CPC H04L 63/1425 (2013.01) [H04L 41/16 (2013.01); H04L 63/20 (2013.01)] 12 Claims
OG exemplary drawing
 
1. A method for implementing customized anomaly detection, comprising:
obtaining time-series data including a plurality of data points;
performing anomaly detection on the time-series data with a first anomaly detection model;
graphically presenting anomaly detection results as a function of time, wherein a first anomaly is represented by first indications and a second anomaly different from the first anomaly is represented by second indications different from the first indications;
receiving feedback associated with an anomaly detection result of at least one data point in the time-series data, the feedback being in a form of adjustments to ones of the first indications and the second indications; and
updating the first anomaly detection model to a second anomaly detection model different from the first anomaly detection model based at least on the feedback through reinforcement learning, wherein the updating the first anomaly detection model to the second anomaly detection model comprises:
optimizing a policy network based at least on the feedback through the reinforcement learning, which includes:
calculating a policy gradient based at least on the anomaly detection result and the feedback; and
adjusting the policy network with the policy gradient; and
determining hyper-parameters through the optimized policy network.
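The reinforcement-learning update recited in the claim (optimize a policy network from user feedback via a policy gradient, then read hyper-parameters off the optimized policy) can be illustrated with a minimal REINFORCE-style sketch. Everything here is an assumption for illustration, not the patented implementation: the z-score detector stands in for the "first anomaly detection model", the softmax over `logits` stands in for the "policy network", the candidate list `THRESHOLDS`, the names `detect_anomalies` and `feedback_reward`, and the accuracy-style reward are all invented, since the patent does not disclose a specific architecture or reward function.

```python
import numpy as np

rng = np.random.default_rng(0)

def detect_anomalies(series, threshold):
    # Stand-in for the "first anomaly detection model": flag points whose
    # z-score exceeds a threshold hyper-parameter.
    z = np.abs((series - series.mean()) / series.std())
    return z > threshold

# Candidate hyper-parameter values the policy chooses among (assumed).
THRESHOLDS = np.array([1.5, 2.0, 2.5, 3.0])
logits = np.zeros(len(THRESHOLDS))  # parameters of a softmax "policy network"

def policy():
    e = np.exp(logits - logits.max())
    return e / e.sum()

def feedback_reward(flags, labels):
    # Reward = agreement between the detector's output and the user's
    # adjusted indications (the claim's feedback signal); here plain accuracy.
    return float((flags == labels).mean())

# Synthetic time series with three injected anomalies; `labels` plays the
# role of user feedback confirming or correcting the anomaly indications.
series = rng.normal(0.0, 1.0, 200)
series[[30, 110, 170]] += 6.0
labels = np.zeros(200, dtype=bool)
labels[[30, 110, 170]] = True

lr, baseline = 0.5, 0.0
for _ in range(300):
    p = policy()
    a = rng.choice(len(THRESHOLDS), p=p)    # sample a hyper-parameter choice
    r = feedback_reward(detect_anomalies(series, THRESHOLDS[a]), labels)
    grad = -p                               # gradient of log pi(a) w.r.t. logits
    grad[a] += 1.0
    logits += lr * (r - baseline) * grad    # REINFORCE policy-gradient step
    baseline += 0.1 * (r - baseline)        # running-mean reward baseline

# Hyper-parameter determined by the optimized policy: the threshold used by
# the updated ("second") anomaly detection model.
best = float(THRESHOLDS[np.argmax(policy())])
```

In this toy setting the loose threshold (1.5) produces many false positives that disagree with the user's feedback, so the policy gradient shifts probability mass toward stricter thresholds; the baseline subtraction is a standard variance-reduction choice, not something the claim specifies.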