CPC G06F 21/564 (2013.01) [G06F 16/9027 (2019.01); G06F 16/906 (2019.01); G06F 21/561 (2013.01); G06F 21/565 (2013.01); G06F 21/568 (2013.01); H04L 9/0643 (2013.01)]
15 Claims
1. A method of detecting malware using a hierarchical evolutionary tree, the method comprising:
generating, using a locality sensitive hashing function, a plurality of digests of sample files, the sample files including samples of malware;
grouping the plurality of digests into a plurality of clusters by performing a recursive clustering process that identifies a pair of digests of the plurality of digests that are not in a same cluster and are closest in terms of distance relative to other digests of the plurality of digests, puts the pair of digests in a same cluster when a distance between the pair of digests is not greater than a first distance threshold, and stops the recursive clustering process when the distance between the pair of digests is greater than the first distance threshold;
grouping the plurality of clusters into a plurality of nodes;
generating a hierarchical evolutionary tree by connecting the plurality of nodes in hierarchical order;
receiving a target digest of a target file; and
placing the target digest in a particular cluster in a particular node in the hierarchical evolutionary tree to determine whether the target file is malware.
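
The clustering and placement steps recited above can be illustrated with a minimal sketch. It is not the patented implementation, and it makes several assumptions that the claim does not state: digests are modeled as plain integers, the distance is a toy Hamming distance standing in for whatever distance the locality sensitive hash defines, "putting the pair in a same cluster" is read as merging the two clusters that hold the pair, and the names `hamming`, `cluster_digests`, and `place` are hypothetical. The node-grouping and tree-building steps are omitted.

```python
from itertools import combinations


def hamming(a: int, b: int) -> int:
    """Toy distance (assumption): number of differing bits between two digests."""
    return bin(a ^ b).count("1")


def cluster_digests(digests, threshold, dist=hamming):
    """Merge clusters around the closest cross-cluster pair until it exceeds threshold."""
    clusters = [[d] for d in digests]  # start with one digest per cluster
    while len(clusters) > 1:
        # Find the closest pair of digests that are not in the same cluster.
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            for a in clusters[i]:
                for b in clusters[j]:
                    d = dist(a, b)
                    if best is None or d < best[0]:
                        best = (d, i, j)
        d, i, j = best
        if d > threshold:
            break  # closest remaining pair is too far apart: stop clustering
        # Assumption: placing the pair in the same cluster merges their clusters.
        clusters[i].extend(clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def place(target, clusters, dist=hamming):
    """Return (cluster index, distance) for the cluster with the member nearest to target."""
    return min(
        ((idx, min(dist(target, m) for m in c)) for idx, c in enumerate(clusters)),
        key=lambda t: t[1],
    )


if __name__ == "__main__":
    samples = [0b0000, 0b0001, 0b0011, 0b11110000, 0b11110001]
    groups = cluster_digests(samples, threshold=1)
    print(groups)
    print(place(0b11110011, groups))
```

With these five toy digests and a threshold of 1, the loop forms two clusters and then stops, because the closest remaining cross-cluster pair is 4 bits apart. The target `0b11110011` lands in the second cluster at distance 1. Whether that placement indicates malware would depend on labels attached to the clusters and nodes, which this sketch does not model.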