US 12,013,937 B1
Detection and identification of malware using a hierarchical evolutionary tree
Jonathan James Oliver, Melbourne (AU); Chia-Yen Chang, Taipei (TW); Wen-Kwang Tsao, Taipei (TW); Joseph Cepe, Mansfield, TX (US); Maria Estella Manly Reyes, Pasig (PH); Paul Christian D. Pajares, Manila (PH); Jayson Pryde, Manila (PH); Chiaming Chiang, Taipei (TW); Brandon Niemczyk, Hutton, TX (US); and Leslie Zsohar, Liberty Hill, TX (US)
Assigned to Trend Micro Incorporated, Tokyo (JP)
Filed by Trend Micro Incorporated, Tokyo (JP)
Filed on Jul. 29, 2021, as Appl. No. 17/388,191.
Application 17/388,191 is a continuation of application No. 16/430,758, filed on Jun. 4, 2019, abandoned.
Int. Cl. G06F 21/56 (2013.01); G06F 16/901 (2019.01); G06F 16/906 (2019.01); H04L 9/06 (2006.01)
CPC G06F 21/564 (2013.01) [G06F 16/9027 (2019.01); G06F 16/906 (2019.01); G06F 21/561 (2013.01); G06F 21/565 (2013.01); G06F 21/568 (2013.01); H04L 9/0643 (2013.01)]
15 Claims
1. A method of detecting malware using a hierarchical evolutionary tree, the method comprising:
generating, using a locality sensitive hashing function, a plurality of digests of sample files, the sample files including samples of malware;
grouping the plurality of digests into a plurality of clusters by performing a recursive clustering process that identifies a pair of digests of the plurality of digests that are not in a same cluster and are closest in terms of distance relative to other digests of the plurality of digests, puts the pair of digests in a same cluster when a distance between the pair of digests is not greater than a first distance threshold, and stops the recursive clustering process when the distance between the pair of digests is greater than the first distance threshold;
grouping the plurality of clusters into a plurality of nodes;
generating a hierarchical evolutionary tree by connecting the plurality of nodes in hierarchical order;
receiving a target digest of a target file; and
placing the target digest in a particular cluster in a particular node in the hierarchical evolutionary tree to determine whether the target file is malware.
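
The clustering and placement steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation. It assumes digests are equal-length hexadecimal strings and uses bit-level Hamming distance as a stand-in for whatever distance the locality sensitive hashing function defines. The names `hamming`, `cluster_digests`, and `place_target` are hypothetical. Reusing the first distance threshold when placing the target digest is also an assumption. The grouping of clusters into nodes and the tree construction are omitted because the claim does not say how they are performed.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Bit-level Hamming distance between two equal-length hex digests (assumed metric)."""
    return bin(int(a, 16) ^ int(b, 16)).count("1")


def cluster_digests(digests, threshold, dist=hamming):
    """Repeatedly merge the closest pair of digests that are in different clusters.

    Stops as soon as the closest such pair is farther apart than `threshold`.
    """
    digests = list(dict.fromkeys(digests))  # drop duplicates, keep order
    cluster_of = {d: i for i, d in enumerate(digests)}  # each digest starts alone
    while True:
        # Candidate pairs: digests that are not yet in the same cluster.
        pairs = [
            (dist(a, b), a, b)
            for a, b in combinations(digests, 2)
            if cluster_of[a] != cluster_of[b]
        ]
        if not pairs:
            break  # everything is already in one cluster
        d, a, b = min(pairs)
        if d > threshold:
            break  # closest remaining pair is too far apart: stop
        # Put the pair in the same cluster by relabeling b's cluster as a's.
        old, new = cluster_of[b], cluster_of[a]
        for key, cid in cluster_of.items():
            if cid == old:
                cluster_of[key] = new
    clusters = {}
    for digest, cid in cluster_of.items():
        clusters.setdefault(cid, []).append(digest)
    return list(clusters.values())


def place_target(target, clusters, threshold, dist=hamming):
    """Return the index of the cluster holding the nearest digest, or None if too far."""
    best = min(
        ((dist(target, member), i) for i, c in enumerate(clusters) for member in c),
        default=None,
    )
    if best is None or best[0] > threshold:
        return None
    return best[1]


samples = ["00", "01", "03", "f0", "f1"]  # toy one-byte digests, for illustration only
clusters = cluster_digests(samples, threshold=1)
print(clusters)
print(place_target("f3", clusters, threshold=1))
```

In this sketch a target digest that lands in a cluster of known malware samples would be treated as a likely variant of that malware, and a result of `None` means no cluster is close enough to support that conclusion.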