| CPC G06F 21/562 (2013.01) [G06F 2221/033 (2013.01)] | 18 Claims |

|
1. A computer-implementable method for detecting malicious files in a non-isolated environment, the method comprising:
during a training phase:
acquiring a plurality of executable files including at least one malicious executable file and at least one non-malicious executable file;
analyzing a binary form of the given executable file to obtain a first data associated with the given executable file,
the first data comprising a Byte/Entropy Histogram associated with the given executable file;
the analyzing comprising analyzing at least one selected from the group consisting of: byte n-grams, data of fields of the given executable file, a file section entropy, metadata of the binary form of the given executable file, and line length distribution histograms;
analyzing a disassembled form of the given executable file to obtain: (i) a second data associated with the given executable file; (ii) a control-flow graph associated with the given executable file, and (iii) a data-flow graph associated with the given executable file,
the second data being different from the first data;
determining, based on data including the first and second data, parameters of the given executable file;
determining the parameters of the given executable file as being indicative of one of a malicious executable file and a non-malicious executable file;
generating, based on the parameters, at least a first feature vector and a second feature vector;
generating, based on the control-flow graph, a third feature vector;
generating, based on the data-flow graph, a fourth feature vector,
each one of the first feature vector, the second feature vector, the third feature vector, and the fourth feature vector being different from an other one thereof; and
training an ensemble of classifiers to determine if a given in-use executable file is one of malicious and non-malicious, the training comprising:
training a first classifier of the ensemble of classifiers based on the first feature vector;
training a second classifier of the ensemble of classifiers based on the second feature vector;
training a third classifier of the ensemble of classifiers based on the third feature vector; and
training a fourth classifier of the ensemble of classifiers based on the fourth feature vector;
assigning to each one of the ensemble of classifiers, a respective decisive priority value,
the respective decisive priority value being indicative of a respective weight assigned to a prediction outcome of a given classifier of the ensemble of classifiers, the respective weight having been determined based on prediction accuracy of the given classifier.
|
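
For illustration only, two steps recited in claim 1 can be sketched in Python: building a Byte/Entropy Histogram from the binary form of a file, and combining the outcomes of four classifiers using weights derived from their prediction accuracy. The claim does not specify any of the details below. The window size, step, bin count, the normalization of accuracies into weights, the 0.5 threshold, and all function names are assumptions chosen for this sketch, not the claimed implementation.

```python
import math
from collections import Counter


def byte_entropy_histogram(data, window=1024, step=256, bins=16):
    """Return a bins x bins histogram of (window entropy, byte value) pairs.

    Assumed scheme: slide a fixed-size window over the bytes, compute the
    Shannon entropy of each window, and count the window's bytes in the
    row selected by that entropy. The sizes are illustrative defaults.
    """
    hist = [[0] * bins for _ in range(bins)]
    if not data:
        return hist
    last_start = max(len(data) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = data[start:start + window]
        counts = Counter(chunk)
        n = len(chunk)
        # Shannon entropy in bits per byte, between 0 and 8.
        entropy = -sum(c / n * math.log2(c / n) for c in counts.values())
        e_bin = min(int(entropy / 8 * bins), bins - 1)
        for value, c in counts.items():
            hist[e_bin][value * bins // 256] += c
    return hist


def accuracy_weights(accuracies):
    """Turn per-classifier accuracies into weights that sum to 1 (assumed rule)."""
    total = sum(accuracies)
    if total <= 0:
        raise ValueError("at least one accuracy must be positive")
    return [a / total for a in accuracies]


def weighted_verdict(predictions, weights, threshold=0.5):
    """Return True (malicious) if the weighted sum of 0/1 predictions reaches the threshold."""
    score = sum(w * p for w, p in zip(weights, predictions))
    return score >= threshold


if __name__ == "__main__":
    # Hypothetical validation accuracies of the four classifiers.
    weights = accuracy_weights([0.9, 0.8, 0.7, 0.6])
    # The first and second classifiers flag the file; the other two do not.
    print(weighted_verdict([1, 1, 0, 0], weights))
```

In this sketch a more accurate classifier carries a larger weight, so the two most accurate classifiers can outvote the two least accurate ones. Other weighting rules would fit the claim language equally well.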