| CPC G06F 21/561 (2013.01) [G06F 21/568 (2013.01); G06N 5/02 (2013.01); G06N 5/04 (2013.01)] | 20 Claims |

|
1. A computer-implemented method, comprising:
constructing a graph data structure comprising detected tag nodes and malware family nodes, and comprising indirect relationships between detected tags and malware families, wherein each detected tag node has one or more outgoing links (OGLs) to malware family nodes;
building a dictionary data structure comprising detected tag entries linking each detected tag to one or more malware family nodes based on the graph data structure;
identifying significant indirect entities (SIEs) within the detected tag entries of the dictionary data structure;
selecting a first SIE, of a plurality of SIEs, as a root node in a family tree data structure;
recursively connecting other SIEs, of the plurality of SIEs, to the root node in the family tree data structure based on OGLs of the SIEs in the plurality of SIEs; and
generating an identifier of the family tree data structure based on SIE identifiers for the SIEs.
|
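
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation, and the claim leaves several details open, so the following are assumptions made only for illustration: a tag counts as a significant indirect entity (SIE) when it has at least `min_links` outgoing links, two SIEs are connected when their outgoing links share a malware family, the first SIE in sorted order is taken as the root, and the tree identifier is a SHA-256 hash of the sorted SIE identifiers. All function names, tag names, and family names are hypothetical.

```python
import hashlib
from collections import defaultdict


def build_graph(detections):
    """Map each detected tag to the set of malware families it links to (its OGLs)."""
    graph = defaultdict(set)
    for tag, family in detections:
        graph[tag].add(family)
    return graph


def build_dictionary(graph):
    """Derive dictionary entries (tag -> sorted family list) from the graph."""
    return {tag: sorted(families) for tag, families in graph.items()}


def identify_sies(dictionary, min_links=2):
    """Assumed criterion: a tag is an SIE if it has at least min_links OGLs."""
    return sorted(tag for tag, fams in dictionary.items() if len(fams) >= min_links)


def build_family_tree(dictionary, sies):
    """Pick the first SIE as root, then recursively attach SIEs sharing a family."""
    if not sies:
        return None, {}
    root = sies[0]
    tree = {root: []}      # parent SIE -> list of child SIEs
    visited = {root}

    def connect(parent):
        parent_families = set(dictionary[parent])
        for candidate in sies:
            # Assumed rule: connect when the OGLs of both SIEs overlap.
            if candidate not in visited and parent_families & set(dictionary[candidate]):
                visited.add(candidate)
                tree[parent].append(candidate)
                tree[candidate] = []
                connect(candidate)

    connect(root)
    return root, tree


def tree_identifier(tree):
    """Assumed scheme: hash the sorted SIE identifiers contained in the tree."""
    joined = "|".join(sorted(tree))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Hypothetical (tag, family) observations.
detections = [
    ("packer_upx", "FamilyA"), ("packer_upx", "FamilyB"),
    ("c2_beacon", "FamilyB"), ("c2_beacon", "FamilyC"),
    ("keylogger", "FamilyC"), ("keylogger", "FamilyD"),
    ("rare_tag", "FamilyE"),
]

dictionary = build_dictionary(build_graph(detections))
sies = identify_sies(dictionary)
root, tree = build_family_tree(dictionary, sies)
family_tree_id = tree_identifier(tree)
```

In this sketch the identifier depends only on which SIEs belong to the tree, not on how they are arranged, so two trees with the same members hash to the same value. That is one possible reading of "based on SIE identifiers"; the claim itself does not specify the scheme.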