US 12,189,716 B1
Predicting likelihood of request classifications using deep learning
Yi Liao, Apex, NC (US); Artin Armagan, Raleigh, NC (US); Phoemphun Oothongsap, Raleigh, NC (US); Brian Christopher Hare, Raleigh, NC (US); Adheesha Sanjaya Arangala, Chapel Hill, NC (US); and Jin-Whan Jung, Raleigh, NC (US)
Assigned to SAS Institute Inc., Cary, NC (US)
Filed by SAS Institute Inc., Cary, NC (US)
Filed on May 23, 2024, as Appl. No. 18/672,589.
Claims priority of provisional application 63/557,973, filed on Feb. 26, 2024.
Int. Cl. G06N 3/086 (2023.01); G06F 18/21 (2023.01); G06F 18/214 (2023.01); G06F 18/22 (2023.01); G06N 3/045 (2023.01); G06N 3/084 (2023.01)
CPC G06F 18/214 (2023.01) [G06F 18/217 (2023.01)]
30 Claims
 
1. A non-transitory computer-readable medium having computer-readable instructions stored thereon that when executed by a processor cause the processor to:
train a deep machine learning model to compute a training score by:
(A) creating a set of training data associated with a plurality of training requests;
(B) inputting the set of training data and a set of weights into the deep machine learning model to compute the training score for each of the plurality of training requests;
(C) comparing the training score of each training request of the plurality of training requests with an expected score for that training request;
(D) responsive to determining that the training score of a predetermined percentage of the plurality of training requests is outside a predetermined threshold of the expected score, computing a loss function;
(E) adjusting the set of weights based on the loss function; and
(F) repeating (A)-(E) until the training score of the predetermined percentage of the plurality of training requests is within the predetermined threshold of the expected score to obtain a trained deep machine learning model;
receive a first set of variables associated with a real-time request;
extract a predetermined subset of the first set of variables to generate a second set of variables for the real-time request;
identify historical request data associated with a predetermined number of historical requests, wherein the historical requests are identified based on the real-time request;
compute a set of parameters based on the first set of variables and the historical request data;
generate a plurality of sequences for the real-time request, wherein the plurality of sequences comprise a plurality of numeric sequences and a plurality of string sequences, wherein each of the plurality of numeric sequences and each of the plurality of string sequences comprises a plurality of values of a specific attribute type, and wherein the plurality of values are selected from the second set of variables, the set of parameters, and the historical request data;
convert each of the plurality of string sequences into an encoded string sequence to obtain a plurality of encoded string sequences;
input the plurality of numeric sequences and the plurality of encoded string sequences into the trained deep machine learning model; and
compute a score from the trained deep machine learning model, the score indicative of a likelihood that the real-time request belongs to an unauthorized classification.
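
The steps recited in claim 1 can be illustrated with a minimal sketch. It is not the patented implementation, and it rests on several assumptions. A single logistic unit stands in for the deep machine learning model. The loss is assumed to be binary cross-entropy. The training data is held fixed rather than re-created on each pass of step (A). The stopping rule reads steps (D) and (F) as "keep adjusting weights until the required share of training scores falls within the threshold of the expected scores". Unseen string values are mapped to a reserved code 0. All function names, parameter defaults, and field names are hypothetical.

```python
import math


def score(weights, features):
    """Toy scorer (assumption): one logistic unit, not a deep network."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def fraction_within(scores, expected, threshold):
    """Share of requests whose score is within `threshold` of its expected score."""
    hits = sum(1 for s, e in zip(scores, expected) if abs(s - e) <= threshold)
    return hits / len(scores)


def train(rows, expected, threshold=0.25, required_fraction=0.9,
          learning_rate=0.5, max_rounds=5000):
    """Repeat scoring and weight updates until enough scores are close enough."""
    weights = [0.0] * len(rows[0])
    n = len(rows)
    for _ in range(max_rounds):
        scores = [score(weights, row) for row in rows]            # step (B)
        if fraction_within(scores, expected, threshold) >= required_fraction:
            break                                                 # steps (C), (F)
        # Steps (D)-(E): gradient of mean binary cross-entropy (assumed loss).
        grads = [
            sum((s - e) * row[j] for s, e, row in zip(scores, expected, rows)) / n
            for j in range(len(weights))
        ]
        weights = [w - learning_rate * g for w, g in zip(weights, grads)]
    return weights


def build_sequences(current, history, numeric_keys, string_keys):
    """One sequence per attribute: historical values first, real-time value last."""
    events = list(history) + [current]
    numeric = {k: [float(e[k]) for e in events] for k in numeric_keys}
    strings = {k: [str(e[k]) for e in events] for k in string_keys}
    return numeric, strings


def encode_strings(sequence, vocabulary):
    """Map each string to an integer code; 0 is reserved for unseen values."""
    return [vocabulary.get(value, 0) for value in sequence]
```

For example, calling `build_sequences` with a real-time request `{"amount": 900.0, "channel": "web"}` and two historical requests yields one three-element numeric sequence for `amount` and one three-element string sequence for `channel`. `encode_strings` then turns the latter into integers before both are passed to the trained model.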