US 11,886,586 B1
Malware families identification based upon hierarchical clustering
Yin-Ming Chang, Taipei (TW); Hsing-Yun Chen, Taipei (TW); Hsin-Wen Kung, Taipei (TW); Li-Chun Sung, Taipei (TW); and Si-Wei Wang, Taipei (TW)
Assigned to Trend Micro, Inc., Tokyo (JP)
Filed by Trend Micro Inc., Tokyo (JP)
Filed on Mar. 6, 2020, as Appl. No. 16/811,651.
Int. Cl. G06F 21/56 (2013.01); G06F 9/54 (2006.01); G06F 18/23213 (2023.01)
CPC G06F 21/566 (2013.01) [G06F 9/54 (2013.01); G06F 18/23213 (2023.01); G06F 21/568 (2013.01)]
20 Claims
 
1. A method of classifying a suspicious file, said method comprising:
determining a plurality of prototype feature vectors, each prototype feature vector having an associated group of feature vectors;
merging said groups of feature vectors into clusters without using a fixed-distance threshold, each of said clusters representing an identified malware family;
creating a feature vector for a behavior report of said suspicious file, said feature vector representing API (application programming interface) calls of said suspicious file;
determining a distance between said feature vector and one of said prototype feature vectors having a first malware family name;
when it is determined that said distance is less than a fixed-distance classification threshold, determining that said suspicious file belongs to said first malware family name; and
taking an action based upon said suspicious file belonging to said first malware family name.
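
The classification steps of claim 1 (feature vector from API calls, distance to a prototype, fixed-distance threshold, action) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the API vocabulary `API_VOCAB`, the L2-normalised call counts, the Euclidean distance, the choice of the nearest prototype, the family names, the threshold value, and the `quarantine`/`allow` actions. The prototype determination and the threshold-free merging of groups into clusters are not shown; the prototypes are simply given as labelled vectors.

```python
from collections import Counter
from math import sqrt

# Hypothetical API vocabulary; the claim does not fix one.
API_VOCAB = ["CreateFileW", "WriteFile", "RegSetValueExW", "CreateRemoteThread"]


def feature_vector(api_calls):
    """Build an L2-normalised count vector over API_VOCAB (assumed encoding)."""
    counts = Counter(api_calls)
    vec = [float(counts[name]) for name in API_VOCAB]
    norm = sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def euclidean(a, b):
    """Euclidean distance; the claim does not name a specific metric."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(vec, prototypes, threshold):
    """Return the family of the nearest prototype if it lies within the
    fixed-distance classification threshold, otherwise None."""
    best_name, best_dist = None, float("inf")
    for name, proto in prototypes:
        dist = euclidean(vec, proto)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist < threshold else None


def take_action(family):
    """Placeholder action: quarantine files matched to a family."""
    return "quarantine" if family else "allow"


# Illustrative prototypes, one per (hypothetical) malware family.
prototypes = [
    ("FamilyA", feature_vector(["CreateFileW", "WriteFile"])),
    ("FamilyB", feature_vector(["CreateRemoteThread"])),
]

# API calls extracted from the behavior report of a suspicious file.
report_calls = ["CreateFileW", "WriteFile", "WriteFile"]
family = classify(feature_vector(report_calls), prototypes, threshold=0.5)
action = take_action(family)
```

The claim only requires comparing against one prototype that carries a family name; scanning all prototypes and keeping the nearest is one way to choose that prototype, not the only one.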