US 12,229,259 B2
Method and system for detecting malicious files in a non-isolated environment
Nikolay Sergeevich Prudkovskij, Moscow (RU)
Assigned to F.A.C.C.T. NETWORK SECURITY LLC, Moscow (RU)
Filed by F.A.C.C.T. NETWORK SECURITY LLC, Moscow (RU)
Filed on Jan. 27, 2022, as Appl. No. 17/586,010.
Application 17/586,010 is a continuation of application No. PCT/RU2020/000089, filed on Feb. 25, 2020.
Claims priority of application No. 2020107922 (RU), filed on Feb. 21, 2020.
Prior Publication US 2022/0164444 A1, May 26, 2022
Int. Cl. G06F 21/56 (2013.01)

CPC G06F 21/562 (2013.01) [G06F 2221/033 (2013.01)]

18 Claims

1. A computer-implementable method for detecting malicious files in a non-isolated environment, the method comprising:

during a training phase:

acquiring a plurality of executable files including at least one malicious executable file and at least one non-malicious executable file;

analyzing a binary form of the given executable file to obtain a first data associated with the given executable file,

the first data comprising a Byte/Entropy Histogram associated with the given executable file;

the analyzing comprising analyzing at least one selected from the group consisting of: byte n-grams, data of fields of the given executable file, a file section entropy, metadata of the binary form of the given executable file, and line length distribution histograms;

analyzing a disassembled form of the given executable file to obtain: (i) a second data associated with the given executable file; (ii) a control-flow graph associated with the given executable file, and (iii) a data-flow graph associated with the given executable file,

the second data being different from the first data;

determining, based on data including the first and second data, parameters of the given executable file;

determining the parameters of the given executable file as being indicative of one of a malicious executable file and a non-malicious executable file;

generating, based on the parameters, at least a first feature vector and a second feature vector;

generating, based on the control-flow graph, a third feature vector;

generating, based on the data-flow graph, a fourth feature vector,

each one of the first feature vector, the second feature vector, the third feature vector, and the fourth feature vector being different from another one thereof; and

training an ensemble of classifiers to determine if a given in-use executable file is one of malicious and non-malicious, the training comprising:

training a first classifier of the ensemble of classifiers based on the first feature vector;

training a second classifier of the ensemble of classifiers based on the second feature vector;

training a third classifier of the ensemble of classifiers based on the third feature vector; and

training a fourth classifier of the ensemble of classifiers based on the fourth feature vector;

assigning, to each one of the ensemble of classifiers, a respective decisive priority value,

the respective decisive priority value being indicative of a respective weight assigned to a prediction outcome of a given classifier of the ensemble of classifiers, the respective weight having been determined based on prediction accuracy of the given classifier.
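
By way of illustration only, and without limiting the claim, two of the recited steps can be sketched in Python: building a Byte/Entropy Histogram from the binary form of a file, and combining the classifier outcomes using accuracy-derived weights. The claim does not fix either computation. The sketch assumes one common sliding-window construction of the histogram (window of 1024 bytes, step of 256 bytes, 16 x 16 bins) and assumes that each decisive priority value is the classifier's accuracy divided by the sum of all accuracies. The function names, parameters, and the 0.5 decision threshold are hypothetical.

```python
import math
from collections import Counter
from typing import List


def byte_entropy_histogram(data: bytes, window: int = 1024, step: int = 256,
                           bins: int = 16) -> List[List[int]]:
    """Return a bins x bins count matrix (rows: entropy bin, columns: byte-value bin).

    Assumed construction, not taken from the claim: each window contributes
    its byte counts to the row selected by the window's Shannon entropy.
    """
    hist = [[0] * bins for _ in range(bins)]
    if not data:
        return hist
    # Slide a window over the file; a file shorter than `window` yields one window.
    last_start = max(len(data) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = data[start:start + window]
        counts = Counter(chunk)
        # Shannon entropy of the window in bits per byte (range 0..8).
        entropy = -sum((c / len(chunk)) * math.log2(c / len(chunk))
                       for c in counts.values())
        row = min(int(entropy / 8 * bins), bins - 1)
        for value, c in counts.items():
            hist[row][value * bins // 256] += c
    return hist


def weighted_verdict(predictions: List[int], accuracies: List[float],
                     threshold: float = 0.5) -> bool:
    """Combine per-classifier outcomes (1 = malicious, 0 = non-malicious).

    Assumption: the weight of each classifier is its share of the summed
    prediction accuracies; the claim only requires that the weight be
    determined based on prediction accuracy.
    """
    total = sum(accuracies)
    weights = [a / total for a in accuracies]
    score = sum(w * p for w, p in zip(weights, predictions))
    return score > threshold
```

In this sketch, four classifiers with validation accuracies 0.9, 0.6, 0.6, and 0.9 that output 1, 0, 0, and 1 produce a weighted score of 0.6, so the file is reported as malicious even though the unweighted vote is tied.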