US 12,135,786 B2
Method and system for identifying malware
Nikolay Sergeevich Prudkovskij, Moscow (RU); and Dmitry Aleksandrovich Volkov, Moscow (RU)
Assigned to F.A.C.C.T. NETWORK SECURITY LLC, Moscow (RU)
Filed by F.A.C.C.T. NETWORK SECURITY LLC, Moscow (RU)
Filed on Mar. 3, 2022, as Appl. No. 17/685,588.
Application 17/685,588 is a continuation of application No. PCT/RU2020/000140, filed on Mar. 16, 2020.
Claims priority of application No. 2020110068 (RU), filed on Mar. 10, 2020.
Prior Publication US 2022/0188417 A1, Jun. 16, 2022
Int. Cl. G06F 21/00 (2013.01); G06F 21/53 (2013.01); G06F 21/56 (2013.01)

CPC G06F 21/566 (2013.01) [G06F 21/53 (2013.01); G06F 2221/033 (2013.01)]

8 Claims

1. A computer-implementable method for training an ensemble of classifiers to determine malware families of malware, the method comprising:

receiving a given sample of training malware of a plurality of samples of training malware;

analyzing the given sample of training malware in an isolated environment;

generating a respective behavioral report including indications of actions executed by the given sample of training malware in the isolated environment;

identifying within the respective behavioral reports associated with each one of the plurality of samples of training malware, a report group of behavioral reports associated with the samples of training malware of a given malware family;

determining by analyzing actions in the report group associated with the given malware family, reference actions common to every sample of training malware in the given malware family;

generating, for a given behavioral report of the report group, a respective training feature vector of a respective plurality of training feature vectors associated with the given malware family, wherein generating a given value of the respective training feature vector comprises:

determining whether a respective field of the given behavioral report corresponds to a respective reference action associated with the given malware family;

training a given classifier of the ensemble of classifiers, based on the respective plurality of training feature vectors to determine if a given in-use sample of malware is of the given malware family; and

using the ensemble of classifiers to identify the given in-use sample of malware by:

receiving the given in-use sample of malware;

analyzing the given in-use sample of malware in the isolated environment;

generating an in-use behavioral report including indications of actions executed by the given in-use sample of malware;

generating a given in-use feature vector associated with the given in-use sample of malware,

a given value of the in-use feature vector being generated based on data in a given field of the in-use behavioral report which corresponds to a respective reference action associated with a respective malware family;

feeding the given in-use feature vector to a respective classifier of the ensemble of classifiers associated with the respective malware family to generate a prediction outcome indicative of whether the given in-use sample of malware is of the respective malware family or not; and

storing data of the prediction outcome in association with the given in-use sample of malware for further use in identifying the malware.
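
The steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each behavioral report has already been reduced to a set of action strings, that a feature value is 1 when the report contains the corresponding reference action and 0 otherwise, and that a simple threshold rule stands in for whatever classifier is actually trained. The family names, action names, sample identifier, and all function and class names are hypothetical.

```python
from dataclasses import dataclass


def reference_actions(report_group):
    """Actions present in every report of one family's report group."""
    common = set(report_group[0])
    for report in report_group[1:]:
        common &= set(report)
    return sorted(common)


def feature_vector(report, refs):
    """One binary value per reference action: 1 if the report contains it."""
    return [1 if action in report else 0 for action in refs]


def score(vector):
    """Fraction of reference actions matched (0.0 for an empty vector)."""
    return sum(vector) / len(vector) if vector else 0.0


@dataclass
class FamilyClassifier:
    """Stand-in per-family classifier (assumption: a threshold on the score)."""
    family: str
    refs: list
    threshold: float = 1.0

    def fit(self, positives, negatives):
        # Put the threshold midway between the lowest in-family score
        # and the highest score seen on samples of other families.
        lowest_in_family = min(score(v) for v in positives)
        highest_other = max((score(v) for v in negatives), default=0.0)
        self.threshold = (lowest_in_family + highest_other) / 2
        return self

    def predict(self, report):
        return score(feature_vector(report, self.refs)) >= self.threshold


def train_ensemble(reports_by_family):
    """Train one classifier per family from grouped behavioral reports."""
    ensemble = {}
    for family, group in reports_by_family.items():
        refs = reference_actions(group)
        positives = [feature_vector(r, refs) for r in group]
        negatives = [feature_vector(r, refs)
                     for other, reports in reports_by_family.items()
                     if other != family
                     for r in reports]
        ensemble[family] = FamilyClassifier(family, refs).fit(positives, negatives)
    return ensemble


def identify(ensemble, sample_id, report, outcomes):
    """Query every family classifier and store the outcome per sample."""
    outcomes[sample_id] = {family: clf.predict(report)
                           for family, clf in ensemble.items()}
    return outcomes[sample_id]


# Hypothetical training reports, already grouped by malware family.
training_reports = {
    "locker": [
        {"create_file", "encrypt_file", "delete_shadow_copies"},
        {"encrypt_file", "delete_shadow_copies", "set_wallpaper"},
    ],
    "stealer": [
        {"read_browser_db", "http_post", "create_file"},
        {"read_browser_db", "http_post"},
    ],
}

ensemble = train_ensemble(training_reports)
outcomes = {}
in_use_report = {"encrypt_file", "delete_shadow_copies", "create_process"}
result = identify(ensemble, "sample-001", in_use_report, outcomes)
print(result)
```

Each family gets its own feature space, because the vector is built only from that family's reference actions. The same in-use report is therefore encoded once per classifier rather than once for the whole ensemble.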