US 11,853,853 B1
Providing human-interpretable explanation for model-detected anomalies
Jocelyn Beauchesne, Saint-Lormel (FR); John Lim Oh, Mukilteo, WA (US); Vasudha Shivamoggi, Cambridge, MA (US); and Roy Donald Hodgman, Cambridge, MA (US)
Assigned to Rapid7, Inc., Boston, MA (US)
Filed by Rapid7, Inc., Boston, MA (US)
Filed on Dec. 31, 2020, as Appl. No. 17/139,812.
Application 17/139,812 is a continuation in part of application No. 17/024,481, filed on Sep. 17, 2020.
Application 17/024,481 is a continuation in part of application No. 17/024,506, filed on Sep. 17, 2020, granted, now 11,509,674.
Claims priority of provisional application 62/901,991, filed on Sep. 18, 2019.
Int. Cl. H04L 9/00 (2022.01); G06N 20/00 (2019.01); H04L 9/40 (2022.01); G06F 18/2113 (2023.01); G06F 18/2132 (2023.01); G06F 18/2433 (2023.01)
CPC G06N 20/00 (2019.01) [G06F 18/2113 (2023.01); G06F 18/2132 (2023.01); G06F 18/2433 (2023.01); H04L 63/1433 (2013.01)]
20 Claims
 
1. A system comprising:
one or more computing devices that implement an anomaly detection system, configured to:
execute one or more anomaly detection models on a dataset of observation records about processes executed on individual machines to determine (a) an outlier record in the dataset, and (b) an outlier score of the outlier record, wherein the one or more anomaly detection models are trained using one or more machine learning techniques;
for individual ones of a plurality of features in the outlier record:
generate, based on the outlier record, a synthetic observation record as a comparison record to use to determine an influence of the feature on the outlier score, wherein the generation retains one or more features from the outlier record in the comparison record and modifies the feature in the comparison record;
execute the one or more anomaly detection models on the comparison record to obtain another outlier score; and
determine, based on the outlier score and the other outlier score, an influence metric indicating the influence of the feature on the outlier score; and
output a model interpretation result of the outlier record, wherein the model interpretation result indicates a subset of the features with highest influence metrics on the outlier score.
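
The per-feature loop recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It makes several assumptions that do not come from the claim: the anomaly detection model is a toy sum of squared z-scores, the synthetic comparison record replaces the feature under test with the dataset median, the influence metric is the drop in outlier score, and the feature names (cpu_seconds, child_processes, bytes_sent) and all function names are hypothetical.

```python
from statistics import mean, median, pstdev

def fit_zscore_model(dataset):
    """Toy stand-in for a trained anomaly detection model (assumption):
    stores the mean and population std of each feature column."""
    return [(mean(col), pstdev(col) or 1.0) for col in zip(*dataset)]

def outlier_score(model, record):
    """Sum of squared z-scores; a larger value means more anomalous."""
    return sum(((x - mu) / sd) ** 2 for x, (mu, sd) in zip(record, model))

def explain_outlier(model, dataset, outlier, feature_names, top_k=2):
    """Rank features by how much the score drops when each is neutralized."""
    base = outlier_score(model, outlier)
    medians = [median(col) for col in zip(*dataset)]
    influence = {}
    for i, name in enumerate(feature_names):
        comparison = list(outlier)    # retain the other features as-is
        comparison[i] = medians[i]    # modify only the feature under test
        # Influence metric (assumption): original score minus comparison score.
        influence[name] = base - outlier_score(model, comparison)
    ranked = sorted(influence.items(), key=lambda kv: kv[1], reverse=True)
    return base, ranked[:top_k]

# Hypothetical observation records about processes on machines.
feature_names = ["cpu_seconds", "child_processes", "bytes_sent"]
records = [
    [1.0, 2, 100],
    [1.2, 3, 110],
    [0.9, 2, 90],
    [1.1, 3, 105],
    [1.0, 2, 5000],
]

model = fit_zscore_model(records)
# Step (a)/(b): the outlier record is the one with the highest outlier score.
outlier = max(records, key=lambda r: outlier_score(model, r))
base_score, top_features = explain_outlier(model, records, outlier, feature_names)

print("outlier record:", outlier)
print("outlier score:", round(base_score, 3))
for name, value in top_features:
    print(f"{name}: influence {value:.3f}")
```

Replacing one feature at a time while retaining the rest isolates that feature's contribution to the score, which is what lets the top-ranked subset serve as the model interpretation result. Other choices of replacement value or influence metric would fit the same loop.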