| CPC G06N 20/20 (2019.01) [G06N 5/01 (2023.01); G06N 20/00 (2019.01)] | 20 Claims |

|
1. A computer-implemented method for training a decision tree using a database system, the decision tree comprising a plurality of decision nodes, the computer-implemented method implemented by a network traffic management system comprising one or more network traffic apparatuses, one or more computing devices, or server devices, the method comprising:
storing in a database input data for training the decision tree, the input data comprising a plurality of feature values corresponding to a plurality of features, wherein the input data is stored in a columnar format, wherein rows of columnar format correspond to input sets of the input data in the database and columns correspond to at least one of the plurality of features, wherein the decision tree is trained using a set of training data comprising the input data, and wherein the set of training data does not comprise output data; and
generating a particular node of the plurality of decision nodes by:
selecting a subset of the plurality of features and a subset of the input data;
using one or more queries to the database system, for each feature of the subset of the plurality of features, calculating a variance value associated with the feature based on the subset of the input data, wherein the calculated variance value is based on aggregate label output values of partitioned subsets of the input data that are generated based on identified patterns relating to legitimacy of requests to one of the server devices, wherein the partitioned subsets of the input data are based on one or more conditions, wherein the aggregate label output values provide status information on conformity of the partitioned subsets to the identified patterns relating to legitimacy of the requests;
identifying a particular feature of the subset of the plurality of features associated with a highest variance value; and
associating the particular node with the particular feature, wherein the particular node causes the decision tree to branch based on the particular feature associated with the highest variance value.
|
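The node-generation steps recited in claim 1 can be illustrated with a minimal sketch. It is not the claimed implementation. It assumes an in-memory SQLite table standing in for the database system, and it uses every stored row as the selected subset of the input data. The table name `input_data`, the feature columns, the split thresholds, and the legitimacy rule in `LABEL_SQL` are all hypothetical. For each candidate feature, a single query partitions the rows on a condition and aggregates a derived label per partition. The variance of those aggregates is the feature's score, and the feature with the highest score is chosen for the node.

```python
import sqlite3
from statistics import pvariance

# Hypothetical input data: one row per input set, one column per feature.
# No output/label column is stored, matching the claim's training data.
ROWS = [
    # (requests_per_min, payload_kb, header_count)
    (5, 2, 10),
    (7, 3, 11),
    (6, 40, 10),
    (300, 2, 11),
    (280, 3, 10),
    (310, 41, 11),
]

# Hypothetical partition condition per feature: value >= threshold.
THRESHOLDS = {"requests_per_min": 100, "payload_kb": 20, "header_count": 11}

# Hypothetical "identified pattern": a request conforms (label 1.0) when it
# arrives at fewer than 100 requests per minute.
LABEL_SQL = "CASE WHEN requests_per_min < 100 THEN 1.0 ELSE 0.0 END"


def make_db():
    """Create the in-memory table and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE input_data "
        "(requests_per_min REAL, payload_kb REAL, header_count REAL)"
    )
    conn.executemany("INSERT INTO input_data VALUES (?, ?, ?)", ROWS)
    return conn


def feature_variance(conn, feature):
    """Variance of the per-partition aggregate label for one feature."""
    if feature not in THRESHOLDS:  # only known column names reach the SQL
        raise ValueError(f"unknown feature: {feature}")
    query = (
        f"SELECT AVG({LABEL_SQL}) FROM input_data "
        f"GROUP BY {feature} >= ?"
    )
    aggregates = [row[0] for row in conn.execute(query, (THRESHOLDS[feature],))]
    # A single partition carries no spread, so score it as zero.
    return pvariance(aggregates) if len(aggregates) > 1 else 0.0


def pick_split_feature(conn, features):
    """Return the highest-variance feature and all computed scores."""
    scores = {f: feature_variance(conn, f) for f in features}
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    conn = make_db()
    best, scores = pick_split_feature(conn, list(THRESHOLDS))
    print(best, scores)
```

In this sketch the grouping and averaging run inside the database, so only one aggregate value per partition is returned to the caller. Restricting the query with a `WHERE` clause would be one way to model a narrower subset of the input data.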