US 12,143,407 B2
Anomaly and ransomware detection
Oscar Annen, San Jose, CA (US); Di Wu, Newark, CA (US); and Ajay Saini, Mountain View, CA (US)
Assigned to Rubrik, Inc., Palo Alto, CA (US)
Filed by Rubrik, Inc., Palo Alto, CA (US)
Filed on Aug. 7, 2019, as Appl. No. 16/534,479.
Prior Publication US 2021/0044604 A1, Feb. 11, 2021
Int. Cl. H04L 29/06 (2006.01); G06F 11/14 (2006.01); G06N 20/00 (2019.01); H04L 9/40 (2022.01)
CPC H04L 63/1425 (2013.01) [G06F 11/1464 (2013.01); G06N 20/00 (2019.01); H04L 63/1416 (2013.01); H04L 63/1466 (2013.01); G06F 2201/84 (2013.01)]
18 Claims
 
1. A system, comprising:
a storage device configured to store one or more snapshots of a primary machine; and
one or more processors in communication with the storage device and a production system, the one or more processors configured to perform anomaly and ransomware detection operations, comprising:
taking a first snapshot and a second snapshot of the primary machine and storing the first snapshot and the second snapshot in the storage device;
generating a first differential filesystem metadata (diff FMD) file by sampling from a first set of metadata files of a first seed dataset that simulates normal operation of the primary machine, and a second diff FMD file by sampling from a second set of metadata files of a second seed dataset that simulates a ransomware infection, the first diff FMD file and the second diff FMD file being indicative of one or more changes of at least one file of the primary machine occurring between the first snapshot and the second snapshot, the one or more changes of the at least one file being represented by respective changes in snapshot-based filesystem metadata of the first diff FMD file and the second diff FMD file;
generating snapshot-based training data by merging the first diff FMD file and the second diff FMD file, wherein the merging of the first diff FMD file and the second diff FMD file simulates ransomware activity for live production data using the snapshot-based filesystem metadata, and wherein the simulated ransomware activity is based on the one or more changes in the snapshot-based filesystem metadata of the first diff FMD file and the second diff FMD file and based on the one or more changes of the at least one file of the primary machine occurring between the first snapshot and the second snapshot;
training one or more machine-learning models using the snapshot-based training data; and
generating an anomaly prediction by inputting a third diff FMD file from a third snapshot into a first machine-learning model of the one or more machine-learning models trained using the snapshot-based training data and an encryption prediction by inputting a result of the anomaly prediction of the first machine-learning model into a second machine-learning model of the one or more machine-learning models to determine whether the anomaly prediction is a ransomware encryption anomaly.
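
The data flow recited in claim 1 (diff two snapshots' filesystem metadata, sample and merge a "normal" diff with a "ransomware" diff to form training data, train, then run a two-stage prediction) can be illustrated with a minimal Python sketch. This is not part of the claim and is not the patented implementation. Every name here (`DiffEntry`, `diff_fmd`, `merge_diffs`, `features`, `train_threshold`, `predict`) is hypothetical, the metadata is reduced to a path-to-size map, and a simple midpoint threshold stands in for the machine-learning models, which the claim does not specify.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class DiffEntry:
    path: str
    change: str       # "created", "deleted", or "modified"
    size_delta: int   # change in size, in bytes

def diff_fmd(old, new):
    """Diff two snapshots' metadata maps (path -> size) into a list of changes."""
    entries = []
    for path in sorted(old.keys() | new.keys()):
        if path not in old:
            entries.append(DiffEntry(path, "created", new[path]))
        elif path not in new:
            entries.append(DiffEntry(path, "deleted", -old[path]))
        elif old[path] != new[path]:
            entries.append(DiffEntry(path, "modified", new[path] - old[path]))
    return entries

def merge_diffs(normal_seed, ransom_seed, k_normal, k_ransom, rng):
    """Sample entries from each seed diff and merge them into one training diff."""
    return rng.sample(normal_seed, k_normal) + rng.sample(ransom_seed, k_ransom)

def features(diff):
    """Summarise a diff as a small feature dict (an assumed feature set)."""
    total = max(len(diff), 1)
    return {
        "n_changes": len(diff),
        "deleted_frac": sum(e.change == "deleted" for e in diff) / total,
    }

def train_threshold(labelled, key):
    """Toy stand-in for training: midpoint between the class means of one feature."""
    pos = [f[key] for f, y in labelled if y]
    neg = [f[key] for f, y in labelled if not y]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def predict(diff, anomaly_thr, encryption_thr):
    """Stage 1 flags an anomaly; stage 2 consumes that result to flag encryption."""
    f = features(diff)
    is_anomaly = f["n_changes"] > anomaly_thr
    is_encryption = is_anomaly and f["deleted_frac"] > encryption_thr
    return is_anomaly, is_encryption

# Seed diffs: ordinary edits versus a simulated mass rename to ".enc" files.
normal_seed = diff_fmd(
    {"a.txt": 10, "b.txt": 20, "c.txt": 30},
    {"a.txt": 12, "b.txt": 20, "c.txt": 30, "d.txt": 5},
)
ransom_seed = diff_fmd(
    {f"doc{i}.txt": 100 for i in range(10)},
    {f"doc{i}.txt.enc": 116 for i in range(10)},
)

rng = random.Random(0)
merged = merge_diffs(normal_seed, ransom_seed, 2, 20, rng)
labelled = [(features(normal_seed), False), (features(merged), True)]

anomaly_thr = train_threshold(labelled, "n_changes")
encryption_thr = train_threshold(labelled, "deleted_frac")

# A "third" diff from a later snapshot, scored by both stages.
print(predict(ransom_seed, anomaly_thr, encryption_thr))
print(predict(normal_seed, anomaly_thr, encryption_thr))
```

In this sketch the second stage only runs on diffs that the first stage has flagged, mirroring the claim's requirement that the result of the anomaly prediction is the input to the encryption prediction. A real system would presumably use richer metadata features and trained classifiers in place of the two thresholds.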