CPC G06N 5/04 (2013.01) [G06F 16/258 (2019.01); G06N 20/00 (2019.01)] | 4 Claims |
1. An apparatus for preprocessing a security log, comprising at least one hardware processor configured to implement:
a field divider configured to divide a character string of a security log into a plurality of fields on the basis of a structure of the security log;
an ASCII code converter configured to convert a character string included in each of the plurality of divided fields into ASCII codes;
a vector data generator configured to generate vector data for each of the plurality of divided fields using the converted ASCII codes, the vector data comprising the converted ASCII codes and a length of the character string included in each of the plurality of divided fields; and
a learning server configured to train a machine learning-based prediction model to predict an intrusion using the vector data,
wherein the ASCII code converter is configured to convert a predetermined character among a plurality of characters included in the character string into a weighted ASCII code, wherein the predetermined character is a character used in an attack script included in the security log,
wherein a dimension of the vector data is determined based on a set maximum length of a character string for each of the plurality of divided fields,
wherein a value obtained by adding 1 to the set maximum length is determined to be the dimension of the vector data.
|
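The preprocessing steps recited in claim 1 can be illustrated with a minimal Python sketch. The claim does not specify how fields are delimited, which characters count as attack-script characters, how the weight is applied, how short fields are padded, or where the length value sits in the vector. The space delimiter, the `ATTACK_CHARS` set, the multiplicative `WEIGHT`, the zero padding, and the trailing length slot are therefore all hypothetical choices made for illustration only. The learning server and prediction model are not sketched.

```python
# Hypothetical set of characters treated as typical of attack scripts.
ATTACK_CHARS = set("<>'\";()")
# Hypothetical weight; the claim only says such characters get a "weighted ASCII code".
WEIGHT = 2


def divide_fields(log_line, delimiter=" "):
    """Divide a log string into fields (assumes a simple delimiter-based structure)."""
    return log_line.split(delimiter)


def to_ascii_codes(text):
    """Convert each character to its ASCII code, weighting the predetermined characters."""
    return [ord(ch) * WEIGHT if ch in ATTACK_CHARS else ord(ch) for ch in text]


def field_vector(text, max_len):
    """Build a vector of dimension max_len + 1 for one field.

    The first max_len slots hold the (possibly weighted) ASCII codes,
    zero-padded or truncated (an assumption); the last slot holds the
    length of the field's character string.
    """
    codes = to_ascii_codes(text[:max_len])
    padded = codes + [0] * (max_len - len(codes))
    return padded + [len(text)]


def preprocess(log_line, max_lens):
    """Generate one vector per field, using a set maximum length for each field."""
    fields = divide_fields(log_line)
    return [field_vector(f, m) for f, m in zip(fields, max_lens)]
```

For example, under these assumptions `field_vector("a<", 4)` yields `[97, 120, 0, 0, 2]`: the code for `a`, the weighted code for `<`, two padding zeros, and the field length, for a dimension of 4 + 1 = 5.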