US 12,462,028 B2
Ransomware detection accuracy based on machine learning analysis of filename extension patterns
Sanath Kumar, Woodbridge, NJ (US); Adwait Vinay Thattey, Boisar (IN); Arun Prasad Amarendran, Manalapan, NJ (US); and Brian F. Brockway, Shrewsbury, NJ (US)
Assigned to Commvault Systems, Inc., Tinton Falls, NJ (US)
Filed by Commvault Systems, Inc., Tinton Falls, NJ (US)
Filed on Feb. 26, 2024, as Appl. No. 18/586,618.
Prior Publication US 2025/0272398 A1, Aug. 28, 2025
Int. Cl. G06F 21/56 (2013.01); G06F 11/14 (2006.01)
CPC G06F 21/565 (2013.01) [G06F 11/1458 (2013.01); G06F 2201/80 (2013.01); G06F 2221/034 (2013.01)]
20 Claims
 
1. A system comprising:
a first computing device comprising one or more hardware processors and computer memory carrying computer programming instructions that, when executed by the one or more hardware processors, configure the first computing device to perform a first backup job of a file system that comprises primary data, wherein the first backup job comprises:
in a scan of the file system, count how many of each distinct filename extension were found;
store a count of each distinct filename extension found in the scan of the file system in a database configured at the first computing device;
cause a copy of the database to be stored at a second computing device;
after one or more backup copies of the primary data are generated:
identify, based on information in the database, a plurality of first counts, wherein each first count corresponds to a distinct filename extension found in the scan of the file system,
identify, based on information in the database, a plurality of second counts, wherein each second count corresponds to a distinct filename extension found in a scan of the file system that was performed in an earlier backup job before the first backup job,
determine a first number of potential renames of filename extensions as compared to the earlier backup job,
use a machine learning model to determine whether the first number of potential renames of filename extensions conforms to a pattern generated by the machine learning model for the file system, wherein the pattern is based on a history of filename extension counts taken in past backup jobs of the file system that precede the first backup job, and
based at least in part on determining that the first number does not conform to the pattern, generate an anomaly alert before completing the first backup job.
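
The counting, comparison, and alerting steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation and rests on several assumptions. The function names (count_extensions, potential_renames, is_anomalous) are hypothetical. The claim does not define how the "number of potential renames" is computed, so the sketch assumes it is the number of files lost from shrinking extensions that could have reappeared under growing extensions. In place of the machine learning model, the sketch substitutes a simple mean-plus-deviation threshold over past rename counts, with assumed parameters k and floor.

```python
from collections import Counter
from pathlib import PurePath
from statistics import mean, pstdev


def count_extensions(paths):
    """Count how many files carry each distinct filename extension."""
    # Files without an extension are counted under the empty string.
    return Counter(PurePath(p).suffix.lower() for p in paths)


def potential_renames(current, previous):
    """Assumed heuristic, not taken from the claim: files that left
    shrinking extensions and could have reappeared under growing ones."""
    lost = sum(max(previous[ext] - current[ext], 0) for ext in previous)
    gained = sum(max(current[ext] - previous[ext], 0) for ext in current)
    return min(lost, gained)


def is_anomalous(value, history, k=3.0, floor=1.0):
    """Statistical stand-in for the machine learning model: flag a value
    that lies well above the historical mean of past rename counts."""
    if not history:
        return False  # no baseline yet, so nothing to compare against
    threshold = mean(history) + k * max(pstdev(history), floor)
    return value > threshold


# Extension counts from an earlier backup job and from the current one.
previous = count_extensions(
    ["a.docx", "b.docx", "c.docx", "d.docx", "e.xlsx", "f.xlsx", "g.pdf"]
)
current = count_extensions(
    ["a.locked", "b.locked", "c.locked", "d.locked",
     "e.locked", "f.locked", "g.pdf"]
)

# Hypothetical rename counts observed in past backup jobs of this file system.
history = [0, 1, 0, 0, 1]

renames = potential_renames(current, previous)
if is_anomalous(renames, history):
    print(f"anomaly alert: {renames} potential extension renames")
```

In this example six files move from .docx and .xlsx to a single new extension, which exceeds the threshold derived from the history and triggers the alert. A trained model would replace is_anomalous, but the surrounding flow of counting per scan, comparing against the earlier job, and alerting before the backup job completes would stay the same.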