US 12,468,806 B2
System and method for converting antivirus scan to a feature vector
Robert J. Joyce, Odenton, MD (US); and Edward Simon Pastor Raff, Jamesville, NY (US)
Assigned to BOOZ ALLEN HAMILTON INC., McLean, VA (US)
Filed by Booz Allen Hamilton Inc., McLean, VA (US)
Filed on Sep. 27, 2023, as Appl. No. 18/475,601.
Claims priority of provisional application 63/489,445, filed on Mar. 10, 2023.
Prior Publication US 2024/0303331 A1, Sep. 12, 2024
Int. Cl. G06F 21/56 (2013.01); G06N 3/0442 (2023.01)
CPC G06F 21/561 (2013.01) [G06N 3/0442 (2023.01); G06F 2221/034 (2013.01)]
24 Claims
 
1. A method for generating a feature vector for malware, the method comprising:
storing, in memory of a computing device, program code for a trained neural network that produces embedded representations for antivirus scan data;
executing, by a processor of the computing device, the program code for the trained neural network, the neural network causing the computing device to be configured to perform the operations of:
(a) receiving an antivirus scan report (AVSR) for a malware file, the AVSR having a label including plural tokens that identify an antivirus product and attributes of the malware file;
(b) normalizing each label in the AVSR by separating each label into a sequence of tokens including a set of token strings;
(c) generating an input sequence for the malware file by embedding a first token and plural second tokens from the AVSR, wherein the first token identifies a start of the input sequence and each second token corresponds to the AVSR of the malware file;
(d) inputting the input sequence into a neural model for producing antivirus scan data; and
(e) outputting the antivirus scan data produced by the neural model as one or more feature vectors.
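Steps (b) and (c) of claim 1 can be illustrated with a minimal sketch. It is not the patented implementation: the splitting rule, the `<SOS>` start token, the `<AV:...>` product tokens, the product names, and the example labels are all assumptions made for illustration. The embedding lookup and the neural model of steps (d) and (e) are omitted; the integer IDs produced here are the kind of input an embedding layer would typically consume.

```python
import re

# Hypothetical start-of-sequence marker; the claim only requires that a
# first token identify the start of the input sequence.
START_TOKEN = "<SOS>"


def normalize_label(label: str) -> list[str]:
    """Split one antivirus label into lowercase alphanumeric token strings.

    Assumed rule: any run of non-alphanumeric characters is a separator.
    """
    return [tok for tok in re.split(r"[^0-9a-zA-Z]+", label.lower()) if tok]


def build_input_sequence(report: dict[str, str]) -> list[str]:
    """Build a token sequence from a report mapping product name -> label.

    Each product contributes a hypothetical product token followed by the
    tokens of its normalized label. Products are visited in sorted order so
    the output is deterministic.
    """
    tokens = [START_TOKEN]
    for product in sorted(report):
        tokens.append(f"<AV:{product}>")
        tokens.extend(normalize_label(report[product]))
    return tokens


def encode(tokens: list[str], vocab: dict[str, int]) -> list[int]:
    """Map tokens to integer IDs, growing the vocabulary as new tokens appear."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]


# Illustrative report with made-up product names and labels.
report = {
    "ProductA": "Trojan.Win32.Agent!gen",
    "ProductB": "W32/Agent.ABC",
}
sequence = build_input_sequence(report)
ids = encode(sequence, {})
print(sequence)
print(ids)
```

Here the token `agent` appears under both products and receives the same ID each time, which is what would let a downstream model relate labels from different antivirus products.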