CPC G06F 21/552 (2013.01) [G06F 21/52 (2013.01); G06N 20/10 (2019.01); G06N 20/00 (2019.01)] | 21 Claims |
1. An apparatus comprising:
interface circuitry to receive a plurality of files from a plurality of devices different than the apparatus;
machine readable instructions; and
one or more processor circuits to execute the machine readable instructions to:
determine respective first formats of the plurality of files, the plurality of files to be used to create a plurality of vector output files;
convert the plurality of files from the respective first formats to a second format, conversion of respective files based on the determination of the respective first formats of the plurality of files;
extract respective features from the respective files of the plurality of files, the respective files in the second format;
identify at least one respective group of contiguous characters in the respective features;
create the plurality of vector output files, respective vector output files including columns, respective columns including at least one number representative of an occurrence of the respective features; and
cause a machine learning algorithm to detect malware observed in at least one file of the plurality of files by outputting the plurality of vector output files to the machine learning algorithm, the plurality of vector output files formatted to be processed by the machine learning algorithm, the machine learning algorithm to analyze the respective features to detect the malware.
|
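The processing steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. It assumes, purely for illustration, that the first format is guessed from leading magic bytes, that the common second format is plain text (hex text for binaries), that the "groups of contiguous characters" are fixed-length character n-grams, and that each vector column holds an occurrence count. The names `detect_format`, `to_common_format`, `extract_ngrams`, and `vectorize` are hypothetical and do not come from the claim.

```python
from collections import Counter


def detect_format(data: bytes) -> str:
    """Guess the first format from magic bytes (illustrative rule set only)."""
    if data.startswith(b"MZ"):
        return "pe"
    if data.startswith(b"\x7fELF"):
        return "elf"
    return "text"


def to_common_format(data: bytes, fmt: str) -> str:
    """Convert to an assumed common second format: hex text for binaries, decoded text otherwise."""
    if fmt in ("pe", "elf"):
        return data.hex()
    return data.decode("utf-8", errors="replace")


def extract_ngrams(text: str, n: int = 4) -> list[str]:
    """Return every group of n contiguous characters in the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def vectorize(files: list[bytes], n: int = 4) -> tuple[list[str], list[list[int]]]:
    """Build one count vector per file over a shared, sorted n-gram vocabulary."""
    counts = []
    for data in files:
        fmt = detect_format(data)             # determine the first format
        text = to_common_format(data, fmt)    # convert to the second format
        counts.append(Counter(extract_ngrams(text, n)))
    vocab = sorted(set().union(*counts))      # one column per distinct n-gram
    rows = [[c[gram] for gram in vocab] for c in counts]
    return vocab, rows


if __name__ == "__main__":
    vocab, rows = vectorize([b"abab", b"MZ\x00"], n=2)
    print(vocab)
    for row in rows:
        print(row)
```

In this sketch each row plays the role of one vector output file, and the rows would then be passed to whatever machine learning algorithm performs the malware detection; that final step is left out because the claim does not specify a particular algorithm.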