| CPC G06F 21/56 (2013.01) [G06F 21/552 (2013.01)] | 23 Claims |

|
1. A computer system comprising at least one hardware processor configured to:
select a reduced subset of features from a plurality of features available for characterizing data samples, wherein selecting the reduced subset of features comprises:
dividing a collection of data samples acquired from a plurality of computing devices into a plurality of training corpora,
selecting a candidate feature from the plurality of features,
determining a first frequency distribution of feature values of the candidate feature over members of a first training corpus of the plurality of training corpora,
determining a second frequency distribution of feature values of the candidate feature over a second training corpus of the plurality of training corpora, and
determining whether to include the candidate feature in the reduced subset of features according to a similarity between the first and second frequency distributions; and
in response to selecting the reduced subset of features, train a threat detector to determine whether a target data sample is indicative of a computer security threat according to the reduced subset of features.
|
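
The feature-selection steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation, and several details are assumptions that the claim does not specify: the round-robin split into corpora, the use of one minus the total variation distance as the similarity measure, the 0.8 threshold, and the choice to keep features whose distributions are similar across corpora (the claim only says the decision is made "according to a similarity"). All function names are hypothetical. Training the threat detector on the reduced subset is outside the sketch.

```python
from collections import Counter
from typing import Dict, Hashable, List, Sequence

Sample = Dict[str, Hashable]  # one data sample: feature name -> feature value


def split_into_corpora(samples: Sequence[Sample], n_corpora: int) -> List[List[Sample]]:
    """Divide the sample collection into training corpora.

    Assumption: a round-robin split. The claim does not say how corpora are formed.
    """
    return [list(samples[i::n_corpora]) for i in range(n_corpora)]


def frequency_distribution(corpus: Sequence[Sample], feature: str) -> Dict[Hashable, float]:
    """Relative frequency of each value of `feature` over the members of a corpus."""
    counts = Counter(sample[feature] for sample in corpus)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def similarity(p: Dict[Hashable, float], q: Dict[Hashable, float]) -> float:
    """Similarity in [0, 1], computed as 1 minus the total variation distance.

    Assumption: the claim names no specific similarity measure.
    """
    values = set(p) | set(q)
    return 1.0 - 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in values)


def select_features(samples: Sequence[Sample], features: Sequence[str],
                    threshold: float = 0.8) -> List[str]:
    """Return the reduced subset of features.

    Assumption: a candidate is kept when its distributions over the first and
    second corpora are at least `threshold` similar.
    """
    first, second = split_into_corpora(samples, 2)
    selected = []
    for candidate in features:
        p = frequency_distribution(first, candidate)
        q = frequency_distribution(second, candidate)
        if similarity(p, q) >= threshold:
            selected.append(candidate)
    return selected


# Toy data: "stable" is distributed identically in both corpora,
# while "drifting" takes a different value in each corpus.
samples = [{"stable": (i // 2) % 2, "drifting": i % 2} for i in range(8)]
selected = select_features(samples, ["stable", "drifting"])
```

With this toy data, "stable" has a similarity of 1.0 between the two corpora and is kept, while "drifting" has a similarity of 0.0 and is dropped. A detector would then be trained using only the features in `selected`.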