| CPC G06F 21/565 (2013.01) [G06F 21/562 (2013.01); G06N 20/00 (2019.01); G06N 20/10 (2019.01); G06F 8/53 (2013.01); G06F 2221/033 (2013.01)] | 20 Claims |

|
1. A computer-implemented method comprising:
identifying, by a knowledge module, static data points that may be indicative of either a harmful or benign executable file;
associating, by the knowledge module, the identified static data points with one of a plurality of categories of files, the plurality of categories of files including harmful files and benign files;
identifying an executable file to be evaluated;
extracting, by the knowledge module, a plurality of static data points from the identified executable file;
generating a feature vector from the plurality of static data points using a classifier trained to classify the static data points based on training data, the training data comprising files known to fit into one of the plurality of categories of files, wherein one or more features of the feature vector generated using the classifier are selectively turned on or off, wherein the one or more features are selectively turned on or off based on one or more values of the static data points being within a predetermined range; and
providing the generated feature vector to a support vector machine to build a probabilistic model that indicates whether the executable file fits into one of the categories of files, the generated feature vector comprising at least one feature that has been selectively turned off.
|
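
The range-based feature gating recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the names `GatedFeature` and `build_feature_vector`, the example data points (`file_size`, `section_count`, `entropy`), and the ranges are all hypothetical. The sketch also assumes that a feature is "on" when its static data point falls inside the predetermined range and is otherwise "off" (emitted as 0.0); the claim does not fix that polarity. The trained classifier and the support vector machine are omitted.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class GatedFeature:
    """Hypothetical feature that is active only inside [low, high]."""
    name: str
    source: str   # key of the static data point this feature reads
    low: float    # lower bound of the predetermined range (inclusive)
    high: float   # upper bound of the predetermined range (inclusive)


def build_feature_vector(points: Dict[str, float],
                         features: List[GatedFeature]) -> List[float]:
    """Return one value per feature; features that are turned off yield 0.0."""
    vector = []
    for feature in features:
        value = points.get(feature.source)
        # Assumption: in range means "on"; missing or out of range means "off".
        is_on = value is not None and feature.low <= value <= feature.high
        vector.append(float(value) if is_on else 0.0)
    return vector


# Illustrative static data points extracted from an executable (made-up values).
points = {"file_size": 48_000, "section_count": 3, "entropy": 7.6}

features = [
    GatedFeature("size", "file_size", 1_000, 1_000_000),
    GatedFeature("sections", "section_count", 5, 50),   # 3 is out of range
    GatedFeature("entropy", "entropy", 0.0, 8.0),
]

vector = build_feature_vector(points, features)
```

Here the `sections` feature is turned off because its value lies outside the assumed range, so the resulting vector contains at least one disabled feature, as the final step of the claim requires. Such a vector could then be passed to any SVM implementation that produces probability estimates.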